Re: GDL2/EOF scaling (was: Re: GNUstep roadmap)

discuss-gnustep

From:	Helge Hess
Subject:	Re: GDL2/EOF scaling (was: Re: GNUstep roadmap)
Date:	Wed, 29 Oct 2003 02:57:16 +0100

On 29.10.2003, at 01:45, Patrick Coskren wrote:

But in an RDBMS, many of the values are relations to other sets. In other words, you have a bunch of rows linked in some graph.


No. You have sets.

I fail to see where rows vs. objects is a critical distinction, unless you're complaining about the fact that the list of values in an object is static rather than dynamic.

Not sure. Yes, the value set of RDBMS queries is completely dynamic, that's true. And it is also true that the attribute set of an object is inherently static (at least for any concrete method).

a) result sets of RDBMS selects are usually partial (that's why they are called *selects*, not because you can specify the filter but because you can select which columns you want to have)

b) result sets of an RDBMS often span rows
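
To make points (a) and (b) concrete, here is a minimal sketch using Python's standard sqlite3 module. The person/address schema and its column names are made up for illustration; nothing here is EOF or GDL2 API.

```python
import sqlite3

# Hypothetical two-table schema, held in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (person_id INTEGER PRIMARY KEY,
                          name TEXT, login TEXT, notes TEXT);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY,
                          person_id INTEGER REFERENCES person(person_id),
                          city TEXT, street TEXT);
    INSERT INTO person  VALUES (1, 'Alice', 'alice', 'long text ...');
    INSERT INTO address VALUES (10, 1, 'Magdeburg', 'Main St 1');
""")

# (a) partial: only two of the eight available columns are selected.
# (b) joined: each result row combines values from two tables, so it
#     corresponds to neither a complete "person" nor a complete "address".
rows = conn.execute("""
    SELECT p.name, a.city
    FROM person p JOIN address a ON a.person_id = p.person_id
""").fetchall()

print(rows)
```

The result row has no primary key of its own, which is exactly what makes it awkward to map onto a class whose identity is defined by one.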

If that's what you're saying, then I can see the point, but in my experience having an object model for your data makes the overall application far more maintainable.

Exactly, you got it! ;-) But it also breaks scalability and performance. EOF would be very nice if it worked out, but it doesn't.

In some cases I've had that object wrap the results of a Perl DBI call (let's say), but I've still had the object. I can see why people prefer a more dynamic approach.

Yes, it certainly does make sense to wrap result sets into their own classes. But this doesn't really match EOF either.

EOF is really built around the principles that object identity is defined by a primary key and that an entity maps to one or more tables which are not really allowed to map to other tables. Otherwise you will get trouble with the framework at various places (and if you really used EOF you know that very well ;-)

Only that this completely breaks identity and the relationships. Now you can also to some degree do subclassing in EOF, but this would even break semantics.
I don't understand what you mean.

If you have a second artificial entity for a limited set of attributes you immediately get two different ObjC objects being mapped to a single primary key, which will destroy a whole lot of assumptions of EOF. One common side effect is that you run into cache conflicts.

As for breaking relationships, why couldn't your second entity define the relationships it needs, same as the first entity?

So that you get toSubsetEntity and toEntity?? If you don't call this a hack ...

Yes, you have to redo some references, but the tools make it easy. As for identity, I don't see it being a problem as long as you never have more than one object in the same EOEditingContext for a particular table row.

If you have multiple overlapping entities keyed on a single primary key, which you are suggesting as a "regular way to deal with the problem", you will actually have multiple objects being keyed to a primary key.
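
The identity problem can be sketched with a toy identity map. This is plain Python, not the EOF API: the class names (Person, PersonSummary, ToyEditingContext) are hypothetical and only model the idea that an editing context keeps one object per entity and primary key.

```python
class Person:
    """Full entity: all attributes of the (hypothetical) person table."""
    def __init__(self, pk, name, login):
        self.pk, self.name, self.login = pk, name, login


class PersonSummary:
    """Second, artificial entity over the same table with fewer attributes."""
    def __init__(self, pk, name):
        self.pk, self.name = pk, name


class ToyEditingContext:
    """Toy identity map: caches one object per (entity, primary key)."""
    def __init__(self, table):
        self.table = table          # dict: primary key -> row dict
        self.cache = {}

    def fetch(self, entity, pk):
        key = (entity.__name__, pk)
        if key not in self.cache:
            row = self.table[pk]
            if entity is Person:
                self.cache[key] = Person(pk, row["name"], row["login"])
            else:
                self.cache[key] = PersonSummary(pk, row["name"])
        return self.cache[key]


table = {1: {"name": "Alice", "login": "alice"}}
ctx = ToyEditingContext(table)

full = ctx.fetch(Person, 1)
summary = ctx.fetch(PersonSummary, 1)

# Same table row, same primary key, but two distinct objects.
full.name = "Alicia"
print(full is summary)   # False
print(summary.name)      # Alice (the second object never sees the change)
```

Within one entity the map does its job (fetching Person 1 again returns the same object), but the overlapping entity sits outside that guarantee, so the two views of the row drift apart.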

But, if I'm remembering correctly, that's a restriction even when you only have one entity for a given table.
And why would subclassing "break semantics"? Because you're more accurately talking about a subsetting operation than a subclassing one?

Exactly. E.g. my example was something like a "SomePersonAndSomeAddressProperties" entity. Inheriting the "Person" entity from this would be wrong, since a Person only has a relationship to an Address, and the other way around is wrong too, since "SomePersonAndSomeAddressProperties" doesn't have all the properties required for a Person method.

In short: RDBMS result sets do not match the object class system at all.

Besides that, since something like a tableview is usually freely configurable from the available attribute set you would need to recreate a dynamic entity all the time. Hack.
If you like. I personally do this manually (if I'm understanding you) even when I'm using raw database calls in a non-EOF context. I find it makes the code clearer.

Well, now I cannot follow you. I mean in a tableview a user can usually select in some preferences which attributes are going to be shown. And exactly those should be fetched.

This is a no-brainer since a tableview maps 1:1 to an RDBMS result set.
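
A minimal sketch of "fetch exactly the columns the user configured", again with Python's sqlite3. The person table, the AVAILABLE_COLUMNS whitelist and the fetch_for_tableview helper are assumptions made for illustration.

```python
import sqlite3

# Hypothetical flat table; a real schema would of course be larger.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY,
                         name TEXT, login TEXT, city TEXT);
    INSERT INTO person VALUES (1, 'Alice', 'alice', 'Magdeburg');
    INSERT INTO person VALUES (2, 'Bob',   'bob',   'Boston');
""")

# Columns a user may choose from in the tableview preferences.
AVAILABLE_COLUMNS = ("name", "login", "city")


def fetch_for_tableview(conn, visible_columns):
    """Fetch only the configured columns, in the configured order."""
    unknown = [c for c in visible_columns if c not in AVAILABLE_COLUMNS]
    if not visible_columns or unknown:
        raise ValueError("invalid column selection: %r" % (visible_columns,))
    # Column names cannot be bound as SQL parameters, so they are checked
    # against the whitelist above before being put into the statement.
    sql = "SELECT %s FROM person ORDER BY person_id" % ", ".join(visible_columns)
    return conn.execute(sql).fetchall()


rows = fetch_for_tableview(conn, ["city", "name"])
print(rows)
```

Each returned tuple is one table row in the view, so no intermediate entity has to be created when the user changes the column preferences.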

Hm, to me the clearest code is always the code which shows what is actually happening. And both variants are one- or two-liners - the EOF variant is not more "expressive" for set operations. Well, actually SQL is optimized for set operations and you

This does not only break interobject relationships, it also breaks internal encapsulation of the objects. That is, you cannot really reuse/write methods anymore.
I'll give you this one. Subclassing might get around this, but I never used it much.

Well, as outlined above, not really, since subclassing is not appropriate for joins.

In 5.0, the support was still a little shaky, and that was the last time I used EOF.


EOF has a WHOLE lot of bugs, which is due to its internal complexity.

No. I mean usual amounts of rows for RDBMS (10000+ at least) with usual amounts of columns (I would say around 10+)
I think I'm following you: you're not worried about the size of the table per se, but the size of the result set. Yes?

Yes. If you want to follow OO paradigms (and this is the point of EOF), you will only fetch complete objects (and their relationships) which map 1:1 to a primary key.

and of course with a result set being joined (which is the hardest part for OO mapping).
It is? I always found that the easiest. purchaseOrder().customer().phoneNumber(). Joining 3 tables.


Only that this is not a usual query ;-)

But I'm still trying to figure out why we have such different perceptions of EOF's scalability.


I think we get to this below ...

But again, I do *really* not expect to be able to convince you.

Well, you never know.

;-)

Sure, there are times when, for performance reasons, you want to load raw rows.
Hm, I wonder in which cases you would resort to that if EOF can deal quite fine with the issues I raised. Where is the performance barrier in your opinion? When did you need to resort to raw rows?
In thinking about this, I think I begin to see where you're coming from. When I used EOF, it was in the context of WebObjects.

Me too, don't know OpenGroupware.org? ;-). Actually the issues are much *less* in a desktop context since the large memory load of EOF is distributed between multiple machines. Yet the advantages of an RDBMS do not play out in that context either.

In a web app, you'd rarely have a result set with 10,000 items,

When you are dealing with fewer than 10000 items you really don't need an RDBMS. How big are those items? You can store all of them in memory! And querying them using EOQualifier is very fast as well. You need distribution? Use DO. Want something a bit more DB-like? Use Berkeley DB.
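
As a rough illustration of the in-memory argument, the sketch below keeps 10000 small records in a Python list and filters them with an ordinary predicate. The list comprehension only stands in for the idea of evaluating a qualifier in memory; it is not EOQualifier, and the record layout is invented.

```python
# 10000 hypothetical records, small enough to live entirely in memory.
items = [
    {"id": i, "owner": "user%d" % (i % 100), "size": i % 500}
    for i in range(10_000)
]


def qualifier(item):
    """In-memory stand-in for something like: owner = 'user7' AND size > 400."""
    return item["owner"] == "user7" and item["size"] > 400


matches = [item for item in items if qualifier(item)]

print(len(matches))       # 20
print(matches[0]["id"])   # 407
```

A full scan over a set this size is a single pass over a list, which is the scale at which a separate database server adds little.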

I do not install an RDBMS for 10000 rows.

BTW: that's part of the reason why Zope is an actual killer. It uses appropriate storage technology. ZODB can easily deal with 10000 objects and a lot of web apps really do not need more than that.

because you couldn't reasonably display that much data on a web page.

You need to query that data nevertheless. This is especially easy with WebObjects, which lets you build a scrollable table view, and users want that.

A case where you might need to resort to raw rows is if you want your app to generate a report for the user to download, and you really would need to process a result set with that many items.

I don't know what you call a report, but I guess in your definition almost any regular query page in OGo would be a "report" (most notably the scheduler views).

On the desktop, however, I can see where you would have 10,000 items in a result set easily. To use your mail example, an Inbox can easily have that many. While a web app would display that 20-50 at a time, a desktop app would dump them all at once.

This has nothing to do with dumping or displaying. The web application like the desktop application will only render 50 records at a time, yet both need to fetch the full set (if only to show the number of results).
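
One way a raw-SQL layer can serve both needs (the total and the visible page) without building 10000 objects is sketched below with Python's sqlite3. The message table and the page size are assumptions, and the database still has to evaluate the whole set to produce the count.

```python
import sqlite3

# Hypothetical inbox with 10000 messages.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT)")
conn.executemany(
    "INSERT INTO message VALUES (?, ?)",
    [(i, "subject %d" % i) for i in range(1, 10_001)],
)

PAGE_SIZE = 50
page_index = 2  # third page, zero-based

# The full set is still queried, but only a single number comes back.
total = conn.execute("SELECT COUNT(*) FROM message").fetchone()[0]

# Only the rows (and columns) that will actually be rendered are fetched.
page = conn.execute(
    "SELECT id, subject FROM message ORDER BY id LIMIT ? OFFSET ?",
    (PAGE_SIZE, page_index * PAGE_SIZE),
).fetchall()

print(total, len(page), page[0])
```

The same two statements work for a web page and for a desktop table view; neither front end changes how large the underlying set is.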

In that case, yeah, you'd want to use raw rows. I can see how EOF *on the desktop* wouldn't scale well.

Actually it scales much better on the desktop than on the web for obvious reasons.

Hmmm.... I think I begin to see why Apple rolled EOF into WebObjects and has been resistant to releasing it on its own. It's a decision I've decried in the past, but maybe they have a point.

The only point is marketing. OR mappers are traditionally a part of application servers and client/server apps are out.

*Chuckle* I like Java. Works great for web applications. Although Python's been tickling my fancy of late, to be sure. *sigh*... really must take another look at Zope. The last time I looked, there was no good documentation out there.

Well, Zope has pretty good documentation. I guess the issue is that Zope doesn't pretend to make hard problems easy ;-)

I can perfectly express that myself. SQL is really easy.
Which is why it can be well automated. :-) Seriously, I like it when my tools take care of the easy stuff. That's a personal choice, of course.

Of course. That's automation because it can be automated, which is a pretty weird motivation to do it.

Even with non-EOF apps, I tend to constraint-check in both the app and the database. The app has the ability to give more immediate feedback to the user, and I don't need to go grunging through the database errors to find out what failed. But the database needs to have the constraints as the final line of defense.

No, that isn't the point. Constraints are for ensuring data model consistency. What you do in advance is checking application level consistency, e.g. you are doing this because you don't want to tell the user "relation xyz still contains abc", but to say "hey guy, you have not yet disconnected task a from person b, should I delete, abort or disconnect?". That's something entirely different.
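
The two levels can be put side by side in a small sketch, using Python's sqlite3 with foreign key enforcement switched on. The person/task schema and the tasks_blocking_delete helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE task   (task_id INTEGER PRIMARY KEY, title TEXT,
                         person_id INTEGER REFERENCES person(person_id));
    INSERT INTO person VALUES (1, 'person b');
    INSERT INTO task   VALUES (1, 'task a', 1);
""")


def tasks_blocking_delete(conn, person_id):
    """Application-level check: which tasks are still connected?"""
    rows = conn.execute(
        "SELECT title FROM task WHERE person_id = ? ORDER BY task_id",
        (person_id,),
    )
    return [title for (title,) in rows]


# Application level: a question the user can actually answer.
blocking = tasks_blocking_delete(conn, 1)
if blocking:
    print("Still connected: %s. Delete, abort or disconnect?" % ", ".join(blocking))

# Data model level: the constraint refuses the delete regardless of the UI.
try:
    conn.execute("DELETE FROM person WHERE person_id = 1")
    db_rejected = False
except sqlite3.IntegrityError as exc:
    db_rejected = True
    print("database says:", exc)
```

The first check produces a choice for the user; the second only guarantees that the data model stays consistent if the application gets it wrong.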

No this is very wrong. I love the architecture and implementation of EOF. It's the best thing I know for what it does (OR mapping). But unfortunately this isn't something which is really useful in practice ;-)
Well, I think it depends on context. I'll concede the point for desktop apps.

;-) Well, EOF is actually somewhat reasonable for desktop apps because you are not hit that much by the memory requirements (which is certainly half of the problem, with select/join performance being the other).


regards,
  Helge
--
OpenGroupware.org
http://www.opengroupware.org/
