programming stuff
May. 16th, 2007 01:46 am
It's still funny how much the use of virtual methods kills speed. The prefetch queue is everything.
Say an algorithm processes large arrays in memory in a loop that does relatively little work per call but is invoked a great many times. The loop iterator is a virtual method of a helper class, which in turn calls a couple more virtual methods, and all of them do very little work. Changing nothing else, we make these methods non-virtual, and the algorithm runs five times faster.
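A minimal sketch of the kind of setup being described (the class names and the float-summing workload are invented for illustration; the post does not show the actual code):

struct cursor
{
  virtual ~cursor () {}
  virtual bool done () const = 0;  // tiny virtual helper
  virtual float get () const = 0;  // tiny virtual helper
  virtual void next () = 0;        // the virtual loop iterator
};

struct array_cursor: cursor
{
  const float* p;
  const float* e;
  array_cursor (const float* b, const float* end): p (b), e (end) {}
  virtual bool done () const { return p == e; }
  virtual float get () const { return *p; }
  virtual void next () { ++p; }
};

float sum (cursor& c)
{
  float s = 0;
  for (; !c.done (); c.next ()) // three virtual calls per element, each doing almost no work
    s += c.get ();
  return s;
}

int main ()
{
  float data[1000] = {};
  array_cursor c (data, data + 1000);
  return static_cast<int> (sum (c));
}

Making done/get/next non-virtual (or letting sum take array_cursor by its concrete type, so the calls can be resolved and inlined) is the kind of change that produces the speedup described above.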
no subject
Date: 2007-05-16 04:41 pm (UTC)
To elaborate a bit on my original statement, JVM can translate a virtual function call to a direct/inline call only if it can infer from the code that the dynamic type and the static type at the call site are always the same. It cannot assume that just because the previous 100 calls were to implementation "a()", the next call will also be to "a()". As a result, what JVM can do at runtime, a C++ compiler can do at compile time. Here is an example:
#include <vector>

using std::vector;

struct b { virtual void f () = 0; };
struct d1: b { virtual void f () { /* ... */ } };
struct d2: b { virtual void f () { /* ... */ } };

int main ()
{
  typedef vector<b*> bv;

  bv v;
  v.push_back (new d1);
  v.push_back (new d2);

  d1 d;

  for (bv::iterator i (v.begin ()); i != v.end (); ++i)
  {
    (*i)->f (); // Neither JVM nor C++ can optimize something like this.
    d.f ();     // Both JVM and C++ can optimize something like this.
  }
}

no subject
Date: 2007-05-16 06:24 pm (UTC)
my general point remains, however: it *is* possible for a dynamic compiler to do much better than this; some, in fact, do.
no subject
Date: 2007-05-17 10:16 am (UTC)
No, I wasn't.
> it *is* possible for a dynamic compiler to do much better than this; some, in fact, do.
Do you "just know", or can you show a situation where a run-time compiler can translate a virtual call to a direct/inline one while a traditional compiler cannot?
no subject
Date: 2007-05-17 10:51 am (UTC)
this is not rocket science; the only reason this is not done in "traditional" OO languages is because the users have been trained not to demand such stuff.
again, I don't see what you are trying to say here: that such optimizations are impossible (incorrect) or that the current state of the art in the C++/Java world is kind of pathetic (perhaps so, but so what? once a need is recognized widely enough, it tends to be fulfilled)?
no subject
Date: 2007-05-17 02:04 pm (UTC)
no subject
Date: 2007-05-17 02:20 pm (UTC)
this is nonsense. firstly, you don't need to look into the future; the past and present are quite enough. either you know that there is only one method that fits the constraints (resulting in a naked direct call), or you know which method was called last from your call site, so you cache it. pretty simple, really, especially for C++ or Java, where all you need to do in order to decide whether to use the cached method is to check the type of one object.
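As a hand-rolled C++ sketch of that caching scheme (the classes, the thunk functions, and the assumption that only d1 and d2 exist are all invented for illustration; a JIT would emit the equivalent directly in the machine code of the call site):

#include <typeinfo>
#include <vector>

struct b { virtual ~b () {} virtual void f () = 0; };
struct d1: b { virtual void f () { /* ... */ } };
struct d2: b { virtual void f () { /* ... */ } };

// Direct-call thunks for the known implementations; the qualified calls bypass virtual dispatch.
void call_d1 (b* p) { static_cast<d1*> (p)->d1::f (); }
void call_d2 (b* p) { static_cast<d2*> (p)->d2::f (); }

void run (std::vector<b*>& v)
{
  // Per-call-site cache: the last dynamic type seen and the matching direct call.
  const std::type_info* cached_type = 0;
  void (*cached_call) (b*) = 0;

  for (std::vector<b*>::iterator i (v.begin ()); i != v.end (); ++i)
  {
    const std::type_info& t = typeid (**i);

    if (cached_type != 0 && t == *cached_type)
      cached_call (*i);  // cache hit: direct call
    else
    {
      (*i)->f ();        // cache miss: ordinary virtual call
      cached_type = &t;  // refill the cache for the next iteration
      cached_call = (t == typeid (d1)) ? &call_d1 : &call_d2;
    }
  }
}

int main ()
{
  std::vector<b*> v;
  v.push_back (new d1);
  v.push_back (new d2);
  run (v);
}

Whether such a check is actually cheaper than the plain indirect call is exactly what the reply below disputes.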
> general hand-waving
in what way is what I said hand-waving? I can dig up some code and disassemble it for you easily enough, but do consider the situation: an anonymous poster wants me to spend effort illustrating something bloody obvious. now, why should I do that?
no subject
Date: 2007-05-17 02:42 pm (UTC)
Ok, I give up. The quoted text shows that you have no idea how the virtual function mechanism works. When you call a virtual function, it is translated to an indirect call, that is, a call to a function whose address is stored somewhere in memory:
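Roughly, the dispatch being described works like this hand-written model (the vtable/object layout below is only illustrative, not the actual ABI):

struct vtable { void (*f) (void* self); /* one slot per virtual function */ };
struct object { const vtable* vtbl;     /* data members follow */ };

void call_f (object* o)
{
  // load the vtable pointer from the object, load the slot for f, then call through it
  o->vtbl->f (o);
}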
So now you suggest that we cache the address of the last function called (plus some typeid of the type it belongs to, but you didn't mention that because you have no clue), then check on the next call whether the types are the same (that is, obtain the typeid of the object and compare it to the stored one), and finally call the cached function:

if (typeid (*i) == cached_typeid)
  cached_function (i);
else
  i->vtbl[0] (i);

Now guess which one will be faster? Hint: just obtaining and comparing typeids will be slower than the whole indirect call.
So instead of talking about stuff you have no clue about, why don't you just go to reddit or something and upvote all those lisp/java/haskell posts.
no subject
Date: 2007-05-17 02:49 pm (UTC)
that's the point: not slower in the situation under discussion here, because from the fact that such an enormous overhead is incurred when calling a virtual method it is obvious that the vtable is not in the processor cache (why this is so is another interesting question)! but the object whose typeid you need is there. thus, accessing the object's typeid is much cheaper than accessing the vtable.
no subject
Date: 2007-05-17 02:41 pm (UTC)
for C++, you are right, because there's no way to add classes at runtime anyway. so failing to cache virtual method calls is even less excusable than it would be otherwise.
now if you'll excuse me, I'll go look for those remedial reading comprehension classes. :/