discuss-gnustep

Re: GDL2/EOF scaling (was: Re: GNUstep roadmap)


From: Helge Hess
Subject: Re: GDL2/EOF scaling (was: Re: GNUstep roadmap)
Date: Wed, 29 Oct 2003 02:57:16 +0100

On 29.10.2003, at 01:45, Patrick Coskren wrote:
But in an RDBMS, many of the values are relations to other sets. In other words, you have a bunch of rows linked in some graph.

No. You have sets.

I fail to see where rows vs. objects is a critical distinction, unless you're complaining about the fact that the list of values in an object is static rather than dynamic.

Not sure. Yes, the value set of RDBMS queries is completely dynamic, that's true. And it is also true that the attribute set of an object is inherently static (at least for any concrete method).
a) result sets of RDBMS selects are usually partial (that's why they are called *selects*: not because you can specify the filter, but because you can select which columns you want to have)
b) result sets of RDBMS queries often span rows

If that's what you're saying, then I can see the point, but in my experience having an object model for your data makes the overall application far more maintainable.

Exactly, you got it! ;-) But it also breaks scalability and performance. EOF would be very nice if it worked out, but it doesn't.

In some cases I've had that object wrap the results of a Perl DBI call (let's say), but I've still had the object. I can see why people prefer a more dynamic approach.

Yes, it certainly does make sense to wrap result sets into their own classes. But this doesn't really match EOF either.

EOF is really built around the principles that object identity is defined by a primary key and that an entity maps to one or more tables which are not really allowed to map to other tables. Otherwise you will get trouble with the framework at various places (and if you really used EOF you know that very well ;-)

Only that this completely breaks identity and the relationships. Now you can also to some degree do subclassing in EOF, but this would even break semantics.
I don't understand what you mean.

If you have a second artificial entity for a limited set of attributes you immediately get two different ObjC objects being mapped to a single primary key, which will destroy a whole lot of EOF's assumptions. One common side effect is that you run into cache conflicts.
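
A minimal sketch of that conflict, assuming a uniquing cache keyed by entity name and primary key. The class names and the "PersonSubset" entity are hypothetical and this is not EOF's API; it only illustrates how two entities over one table row yield two objects with independent state.

```python
# Hypothetical illustration, not EOF's API: a uniquing cache keyed by
# (entity name, primary key), the way an OR mapper tracks object identity.
class Row:
    def __init__(self, entity, pk, **attrs):
        self.entity = entity
        self.pk = pk
        self.attrs = dict(attrs)


class IdentityMap:
    def __init__(self):
        self._objects = {}

    def register(self, entity, pk, **attrs):
        # One object per (entity, pk); a second fetch returns the same object.
        key = (entity, pk)
        if key not in self._objects:
            self._objects[key] = Row(entity, pk, **attrs)
        return self._objects[key]


cache = IdentityMap()
person = cache.register("Person", 42, name="Smith", city="Berlin")
subset = cache.register("PersonSubset", 42, name="Smith")

# Same table row, but two distinct objects with independent state.
person.attrs["name"] = "Jones"
same_object = person is subset
stale_name = subset.attrs["name"]
```

After the update, the subset object still carries the old name, which is the kind of cache conflict meant above.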

As for breaking relationships, why couldn't your second entity define the relationships it needs, same as the first entity?

So that you get toSubsetEntity and toEntity?? If you don't call this a hack ...

Yes, you have to redo some references, but the tools make it easy. As for identity, I don't see it being a problem as long as you never have more than one object in the same EOEditingContext for a particular table row.

If you have multiple overlapping entities keyed on a single primary key, which you are suggesting as a "regular way to deal with the problem", you will actually have multiple objects being keyed to a primary key.

But, if I'm remembering correctly, that's a restriction even when you only have one entity for a given table.

And why would subclassing "break semantics"? Because you're more accurately talking about a subsetting operation than a subclassing one?

Exactly. E.g. my example was something like a "SomePersonAndSomeAddressProperties" entity. Inheriting the "Person" entity from this would be wrong, since a Person only has a relationship to an Address, and the other way around is wrong too, since "SomePersonAndSomeAddressProperties" doesn't have all the properties required for a Person method. In short: RDBMS result sets do not map onto the object class system at all.

Besides that, since something like a tableview is usually freely configurable from the available attribute set you would need to recreate a dynamic entity all the time. Hack.
If you like. I personally do this manually (if I'm understanding you) even when I'm using raw database calls in a non-EOF context. I find it makes the code clearer.

Well, now I cannot follow you. I mean in a tableview a user can usually select in some preferences which attributes are going to be shown. And exactly those should be fetched.
This is a no-brainer since a tableview maps 1:1 to an RDBMS result set.
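
As a rough sketch of fetching exactly the attributes a user has configured for a table view, assuming Python's sqlite3 module and a hypothetical "person" table and column whitelist (none of these names come from the thread):

```python
import sqlite3

# Hypothetical schema and column whitelist, for illustration only.
AVAILABLE_COLUMNS = ("id", "name", "email", "city")


def fetch_visible(conn, visible_columns):
    # Only whitelisted names reach the SQL text, so the column list
    # can come from user preferences without injection risk.
    columns = [c for c in visible_columns if c in AVAILABLE_COLUMNS]
    if not columns:
        raise ValueError("no valid columns selected")
    sql = "SELECT {} FROM person ORDER BY id".format(", ".join(columns))
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO person (name, email, city) VALUES (?, ?, ?)",
    [
        ("Smith", "smith@example.org", "Berlin"),
        ("Jones", "jones@example.org", "Boston"),
    ],
)

# The user's preferences say: show name and city only.
rows = fetch_visible(conn, ["name", "city"])
```

Each result tuple has exactly the shape of one table view row, with no object per database row in between.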

Hm, to me the clearest code is always the code which shows what is actually happening. And both variants are one or two liners - the EOF variant is not more "expressive" for set operations. Well, actually SQL is optimized for set operations and you

This does not only break interobject relationships, it also breaks internal encapsulation of the objects. That is, you cannot really reuse/write methods anymore.
I'll give you this one. Subclassing might get around this, but I never used it much.

Well, as outlined above, not really, since subclassing is not appropriate for joins.

In 5.0, the support was still a little shaky, and that was the last time I used EOF.

EOF has a WHOLE lot of bugs, which is due to its internal complexity.

No. I mean usual amounts of rows for RDBMS (10000+ at least) with usual amounts of columns (I would say around 10+)
I think I'm following you: you're not worried about the size of the table per se, but the size of the result set. Yes?

Yes. If you want to follow OO paradigms (and this is the point of EOF), you will only fetch complete objects (and their relationships) which map 1:1 to a primary key.

and of course with a result set being joined (which is the hardest part for OO mapping).
It is? I always found that the easiest. purchaseOrder().customer().phoneNumber(). Joining 3 tables.

Only that this is not a usual query ;-)
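
To illustrate the difference between walking a relationship per object and asking the database for the joined set, here is a deliberately naive sketch using Python's sqlite3 module. The two-table schema and the helper names are assumptions, and a real OR mapper can batch or prefetch relationships, so this is not a statement about how EOF itself issues queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, phone TEXT);
    CREATE TABLE purchase_order (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customer VALUES (1, '555-0100'), (2, '555-0101');
    INSERT INTO purchase_order VALUES (10, 1), (11, 2), (12, 1);
""")

queries = 0


def run(sql, params=()):
    # Count every statement sent to the database.
    global queries
    queries += 1
    return conn.execute(sql, params).fetchall()


def phones_by_traversal():
    # One query for the orders, then one more per order to reach the customer.
    phones = []
    for (customer_id,) in run("SELECT customer_id FROM purchase_order ORDER BY id"):
        rows = run("SELECT phone FROM customer WHERE id = ?", (customer_id,))
        phones.append(rows[0][0])
    return phones


def phones_by_join():
    # The same information as a single set operation.
    rows = run(
        "SELECT c.phone FROM purchase_order o "
        "JOIN customer c ON c.id = o.customer_id ORDER BY o.id"
    )
    return [phone for (phone,) in rows]


queries = 0
traversal = phones_by_traversal()
traversal_queries = queries

queries = 0
joined = phones_by_join()
join_queries = queries
```

Both variants return the same phone numbers; the naive traversal needs one query per order plus one, the join needs a single query.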

But I'm still trying to figure out why we have such different perceptions of EOF's scalability.

I think we get to this below ...

But again, I do *really* not expect to be able to convince you.
Well, you never know.

;-)

Sure, there are times when, for performance reasons, you want to load raw rows.
Hm, I wonder in which cases you would resort to that if EOF can deal quite fine with the issues I raised. Where is the performance barrier in your opinion? When did you need to resort to raw rows?

In thinking about this, I think I begin to see where you're coming from. When I used EOF, it was in the context of WebObjects.

Me too; don't you know OpenGroupware.org? ;-) Actually the issues are much *less* in a desktop context since the large memory load of EOF is distributed between multiple machines. Yet the advantages of an RDBMS do not play out in that context either.

In a web app, you'd rarely have a result set with 10,000 items,

When you are dealing with fewer than 10000 items you really don't need an RDBMS. How big are those items? You can store all of them in memory! And querying them using EOQualifier is very fast as well. You need distribution? Use DO. Want something a bit more DB-like? Use Berkeley DB.
I do not install an RDBMS for 10000 rows.
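
A small sketch of the in-memory alternative, assuming 10,000 hypothetical records held as plain dicts and a made-up select() helper standing in for qualifier evaluation (it is not EOQualifier):

```python
# Hypothetical in-memory store: 10,000 small records held as plain dicts.
records = [
    {"id": i, "owner": "user%d" % (i % 100), "done": i % 3 == 0}
    for i in range(10000)
]


def select(items, **conditions):
    # Keep every record whose fields equal all the given key/value pairs,
    # a rough stand-in for evaluating a qualifier against objects in memory.
    return [
        item for item in items
        if all(item.get(key) == value for key, value in conditions.items())
    ]


open_for_user7 = select(records, owner="user7", done=False)
```

A linear scan over a set of this size is cheap enough that no index or database server is involved.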

BTW: that's part of the reason why Zope is an actual killer. It uses appropriate storage technology. ZODB can easily deal with 10000 objects and a lot of web apps really do not need more than that.

because you couldn't reasonably display that much data on a web page.

You need to query that data nevertheless. This is especially easy with WebObjects, which lets you build a scrollable table view, and users want that.

A case where you might need to resort to raw rows is if you want your app to generate a report for the user to download, and you really would need to process a result set with that many items.

I don't know what you call a report, but I guess in your definition almost any regular query page in OGo would be a "report" (most notably the scheduler views).

On the desktop, however, I can see where you would have 10,000 items in a result set easily. To use your mail example, an Inbox can easily have that many. While a web app would display that 20-50 at a time, a desktop app would dump them all at once.

This has nothing to do with dumping or displaying. The web application like the desktop application will only render 50 records at a time, yet both need to fetch the full set (if only to show the number of results).

In that case, yeah, you'd want to use raw rows. I can see how EOF *on the desktop* wouldn't scale well.

Actually it scales much better on the desktop than on the web for obvious reasons.

Hmmm.... I think I begin to see why Apple rolled EOF into WebObjects and has been resistant to releasing it on its own. It's a decision I've decried in the past, but maybe they have a point.

The only point is marketing. OR mappers are traditionally a part of application servers and client/server apps are out.

*Chuckle* I like Java. Works great for web applications. Although Python's been tickling my fancy of late, to be sure. *sigh*... really must take another look at Zope. The last time I looked, there was no good documentation out there.

Well, Zope has pretty good documentation. I guess the issue is that Zope doesn't pretend to make hard problems easy ;-)

I can perfectly express that myself. SQL is really easy.
Which is why it can be well automated. :-) Seriously, I like it when my tools take care of the easy stuff. That's a personal choice, of course.

Of course. That's automation because it can be automated, which is a pretty weird motivation to do it.

Even with non-EOF apps, I tend to constraint-check in both the app and the database. The app has the ability to give more immediate feedback to the user, and I don't need to go grunging through the database errors to find out what failed. But the database needs to have the constraints as the final line of defense.

No, that isn't the point. Constraints are for ensuring data model consistency. What you do in advance is check application-level consistency, e.g. you are doing this because you don't want to tell the user "relation xyz still contains abc", but to say "hey guy, you have not yet disconnected task a from person b, should I delete, abort or disconnect?". That's something entirely different.
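
The two levels can be sketched side by side, assuming Python's sqlite3 module and a hypothetical person/task schema. The delete_person() helper and its message text are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        title TEXT,
        person_id INTEGER REFERENCES person(id)
    );
    INSERT INTO person VALUES (1, 'b');
    INSERT INTO task VALUES (1, 'a', 1);
""")


def delete_person(person_id):
    # Application-level check: phrase the problem in the user's terms
    # and offer choices before touching the database.
    tasks = conn.execute(
        "SELECT title FROM task WHERE person_id = ?", (person_id,)
    ).fetchall()
    if tasks:
        titles = ", ".join(title for (title,) in tasks)
        return "still assigned to: %s (delete, abort or disconnect?)" % titles
    conn.execute("DELETE FROM person WHERE id = ?", (person_id,))
    return "deleted"


message = delete_person(1)

# Data-model constraint: it still holds if the application check is bypassed,
# but all it can report is a generic integrity error.
try:
    conn.execute("DELETE FROM person WHERE id = 1")
    constraint_held = False
except sqlite3.IntegrityError:
    constraint_held = True
```

The application check produces a question the user can answer; the constraint only guarantees that the data model stays consistent.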

No, this is very wrong. I love the architecture and implementation of EOF. It's the best thing I know for what it does (OR mapping). But unfortunately this isn't something which is really useful in practice ;-)
Well, I think it depends on context. I'll concede the point for desktop apps.

;-) Well, EOF is actually somewhat reasonable for desktop apps because you are not hit that much by the memory requirements (which is certainly half of the problem, with select/join performance being the other).

regards,
  Helge
--
OpenGroupware.org
http://www.opengroupware.org/




