So, as talked about at the last users' conference, here's a teaser.
I have code that can parse SQL-like strings and files, and product EntityCondition objects. For example: == InputStream in = StringBufferInputStream( "CONDITION foo field1 = 'bar' OR field2 = 'baz' AND field3 " + "IN (1, 2, 3); CONDITION bar id = ?id;"; SqlScript sqlScript = new SqlScript(inputStream); SqlScriptVisitor visitor = MakeCondition(); SimpleNode node = sqlScript.NamedConditions(); Map namedConditions = (Map) node.jjtAccept(visitor, null); == That returns a map of name -> condition. There is support for named parameters, which make use of the freeze support I added last year, and a ThreadLocal, to convert the condition to a real instance. The above pseudo-code(which is very close to the real code I have) can cache the parsed EntityCondition object, and only needs to call freeze() if parameters are being expanded. Here's the api I'd like to see: == condition = EntityConditionUtil.getCondition(string); condition = EntityConditionUtil.getCondition(string, map); condition = EntityConditionUtil.get(url, name); condition = EntityConditionUtil.get(url, name, map); == The url forms could use UtilCache to store the parsed file. The variants that take a map will call freeze() to create a real condition. Here's what I have done. * jjtree/javacc grammar for conditions(actually, full sql, see below) * implementation of SqlScriptVisitor that produces EntityCondition objects * new EntityConditionValue classes: * NamedParameter: placeholder for ?name in condition strings * ConstantValue: NamedParameter replaces itself with this during freeze. All that's needed is a url->inputstream mapper(trivial), storage in UtilCache, and the above wrapping code in EntityConditionUtil. Ok, now on to the next big feature. The jjtree grammar I have is not just for conditions. It can do full sql. And by full script, I mean CREATE TABLE, SELECT, INSERT, etc. To test this system out, I wrote an xsl stylesheet, that converts entitymodel.xml files into sql. I then did it over all such files in ofbiz. I then wrote a visitor that walks the parsed tree, and produces standard ModelEntity/ModelViewEntity objects. With counting in place, I get identical entity/view counts as the standard ModelReader stuff. Initially, this may look like a bunch of busy work. Why should entitymodel be done in sql, instead of the current working form? Well, if you were only concerned with plain tables, and not views, you would be correct. However, views are much more powerful when you can make use of the full sql syntax. This includes nested subselects in from items, complex nested inner/outer joins, and mathematical operations everywhere. The existing ModelViewEntity and sql generating code in in SqlJdbcUtil do not support such advanced features. This system can also be a replacement for DynamicViewEntities, which are rather cumbersome to work with. You already need to understand sql to write a view, and you also need to understand how to use the DynamicViewEntities api. With this new system, you just need to know sql. This last viewpoint also applies to ModelViewEntity xml. Why bother learning a second xml language for defining views, when sql will do just fine? So now you may be wondering why I wrote all this. You may ask, isn't the current system just fine? Well, while the current system does work, and is fairly well tested, it has several shortcomings, which have now permeated themselves thruout the code base, and the ofbiz community. As already mentioned, existing view entities are limited in their features. Because of the limit syntax, java code gets written that ends up doing several round trips to the database(or the entity-cache stuff I wrote), and recombing data in memory. However, it's been my experience, that a properly written query, with proper indexing, can be extremely fast. Let me give you an example. When I wrote the entitycache stuff and submitted it over a year ago, it was due to realized performance issues with the existing condition/caching stuff in ofbiz. This was due to the job I was working on at the time having some rather large databases, and some complex data arrangements. The reverse-dependency cache clearing code was an attempt to solve this, by clearing views when a normal entity was updated. It mostly works, with some very small border conditions(some have been fixed since then). The views that I was using that drove this development, have 10 members, but only 8 tables. 2 of those tables are joined twice. Additionally, a feature I wrote, but never submitted, was the ability to restrict views with conditions in the xml. I never liked my implemetation, so never sent it. After I got the cache clearing stuff done, and before I sent it to ofbiz, I realized that the complex view I had done was not performing well in the database. This was due to my initial lack of proper indexing. The existing deployed site now doesn't even cache that query *at all*, and the page it's one is the one of the main pages for the site(gives users a transcript of all certification tests they have taken, grouped into categories, with red/yellow/green lights signifying whether they have passed, or their cert is about to expire). It was at this point that I decided to trust the database, and let it do all the work. As much as I could, I tried to create a view, and not have any loop code at all. Once you have to do multiple queries in a loop(even with caching), performance falls about. So, now jump forward a year. The current projects we(brainfood) are working on are based on a new wiki system we've developed. One of the goals for this system is to simplify things as much as possible; to lower the barrier for entry when creating websites. Wiki markup is one of the features we implemented to support this end goal. Ean(my boss) wrote a very simple, but very easy to use, SQLUtil tool. Give it an ofbiz connection name(ie, "localpostgres"), a raw sql query string, and it will return you a list of maps, one map for each row in the database. Writing sql is much simpler than writing the xml stuff, and this tool let him change the sql easily at runtime. However, using this system requires one to know the entityname/fieldname mapping syntax, which is yet another thing to learn. It also had no caching whatsoever. One of the tools we integrated for this wiki system, is velocity. This is simpler(main design goal) than freemarker, and faster(freemarker copies object graphs when placing them into the template context). It uses jjtree to parse it's templates, and uses the AST(abstract syntax tree) at runtime to evaluate everything. This allowed me to see a working jjtree/javacc setup, and helped me to learn that tool. So, now that we(brainfood) have a need for a sql parser(to speed up SQLUtil, and to make things easier to work with), and that I know how to use jjtree/javacc, this feature can actually be implemented. Ok, enough rambling. Let me quickly explain the attachments. SqlScript.jtt: This is the sql grammar file. Fairly standard set of options needed to parse that into the SqlScript.jj file, then produce the SqlScript.java file from that. There are 2 top-level entry points: SQLScript, and NamedConditions. Both expect to parse a full file. The former will handle CREATE TABLE, CREATE VIEW, and CREATE INDEX. The latter handles the CONDITION name <condition> stuff. Both output an AST of the parsed data. The AST produced is meanted to be accessed by using an implementation of SqlScriptVisitor. I have versions for parsing CREATE TABLE, CREATE VIEW, and CONDITION trees; however, they require some minor changes to existing ofbiz code, so are not yet included here. entitymodel.xslt: XSL stylesheet that can convert entitymodel.xml files into sql files. Used to automatically convert everything, for testing. model.sql.gz: Example file containing all existing ofbiz entity models converted to sql. My development plan for integrating this, is to not replace anything, at first. I want this new code to exist alongside the existing sql/entitymodel/ModelEntity stuff. This way, no unknown problems occur. I'll be modifying ModelReader to be an interface, then having different implementations, one for the xml, and one for the sql forms. However, even at this step, no new features will exist. I'll then make a third ModelReader, that will not restrict what you can do for CREATE VIEW. This will replace the existing ModelViewEntity, and SqlJdbcUtil. It'll most likely just keep the post-processed AST around at runtime, for generating the actual sql. I don't have a time table for completly any of these steps. I can, however, probably get the condition parsing stuff integrated fairly quick, as that is non-invasive, and non-disrupting. ps: sorry for the length, but it's hard to shrink this down. |
Adam,
This was from a while ago, but I'm curious where you are on this. I initially had reservations about another set of technologies in the OFBiz framework, but I think your solution could really help with "business intelligence" or reporting tasks. I would really like to be able to write complex queries against the OFBiz entity and field names (ie, not the underlying database ones) and then have them execute through the entity engine, while still preserving true database independence. So, please let me know if you're still interested in making this available and whether you'd like to get some help on it. Si PS Here are the links to the original messages, for anybody who's interested in reading about them again: http://lists.ofbiz.org/pipermail/dev/2005-September/009122.html http://lists.ofbiz.org/pipermail/dev/2005-September/009173.html Adam Heath wrote: >So, as talked about at the last users' conference, here's a teaser. > >I have code that can parse SQL-like strings and files, and product >EntityCondition objects. For example: > >== >InputStream in = StringBufferInputStream( >"CONDITION foo field1 = 'bar' OR field2 = 'baz' AND field3 " + >"IN (1, 2, 3); CONDITION bar id = ?id;"; >SqlScript sqlScript = new SqlScript(inputStream); >SqlScriptVisitor visitor = MakeCondition(); >SimpleNode node = sqlScript.NamedConditions(); >Map namedConditions = (Map) node.jjtAccept(visitor, null); >== > >That returns a map of name -> condition. There is support for named >parameters, which make use of the freeze support I added last year, and a >ThreadLocal, to convert the condition to a real instance. > >The above pseudo-code(which is very close to the real code I have) can cache >the parsed EntityCondition object, and only needs to call freeze() if >parameters are being expanded. > > > > _______________________________________________ Dev mailing list [hidden email] http://lists.ofbiz.org/mailman/listinfo/dev |
On Wed, 11 Jan 2006, Si Chen wrote:
> Adam, > > This was from a while ago, but I'm curious where you are on this. > > I initially had reservations about another set of technologies in the > OFBiz framework, but I think your solution could really help with > "business intelligence" or reporting tasks. I would really like to be > able to write complex queries against the OFBiz entity and field names > (ie, not the underlying database ones) and then have them execute > through the entity engine, while still preserving true database > independence. > > So, please let me know if you're still interested in making this > available and whether you'd like to get some help on it. Well, it can't do joins. Without that feature, it's mostly useless. I am using svk locally, and have this checked in. I could merge this to you guys, easy enough, that way(svk push, after I merge the newer upstream into my work branch). It's fairly contained, and the existing code doesn't change any existing functionality. It's all new stuff. Using the sql as a macro expander, by making use of the meta data in ModelEntity and ModelViewEntity, is not something I have attempted yet. It'd require multiple passes over the AST(Abstract Syntax Tree). I do have code that can convert *all* entitymodel.xml files into compatible CREATE TABLE/CREATE VIEW stmts, then code that can parse those sql stmts into AST. I only did this because it seemed like a good way to verify my code was handling everything. I don't have anything that handles SELECT/INSERT/UPDATE/DELETE. The AST parts are there, but I've not started any visitor stuff. Lately, we have been skipping ofbiz entirely, and doing direct database queries, that return a list of maps. This is way faster than doing loop code in java, then combining as needed. For example, here's a query that we have setup. It's a velocity template, which explains the $var and #directive stuff. It's just not possible to do anything like this with ofbiz views. It's also something I want to be able to support with my sql stuff, if I ever get around to it. == SELECT per.party_id AS "partyId", per.first_name AS "firstName", per.last_name AS "lastName", a.tech_id AS "technicianId", a.email_address AS "emailAddress", a.city AS "city", a.state_province_name AS "state", a.address_id AS "addressId" FROM person per JOIN ( SELECT a.party_id, a.tech_id, b.email_address, c.city, c.state_province_name, c.contact_mech_id AS address_id FROM ( SELECT a.contact_mech_id, b.party_id, a.info_string AS tech_id FROM contact_mech a JOIN party_contact_mech b ON a.contact_mech_id = b.contact_mech_id JOIN party_contact_mech_purpose c ON b.contact_mech_id = c.contact_mech_id WHERE (b.from_date IS NULL OR b.from_date <= current_timestamp) AND (b.thru_date IS NULL OR b.thru_date > current_timestamp) AND (c.from_date IS NULL OR c.from_date <= current_timestamp) AND (c.thru_date IS NULL OR c.thru_date > current_timestamp) AND a.contact_mech_type_id = 'ECOUNT_PUID' AND c.contact_mech_purpose_type_id = 'ECOUNT_TECH_ID' ) a JOIN ( SELECT a.contact_mech_id, b.party_id, a.info_string AS email_address FROM contact_mech a JOIN party_contact_mech b ON a.contact_mech_id = b.contact_mech_id JOIN party_contact_mech_purpose c ON b.contact_mech_id = c.contact_mech_id WHERE (b.from_date IS NULL OR b.from_date <= current_timestamp) AND (b.thru_date IS NULL OR b.thru_date > current_timestamp) AND (c.from_date IS NULL OR c.from_date <= current_timestamp) AND (c.thru_date IS NULL OR c.thru_date > current_timestamp) AND a.contact_mech_type_id = 'EMAIL_ADDRESS' AND c.contact_mech_purpose_type_id = 'PRIMARY_EMAIL' ) b ON a.party_id = b.party_id JOIN ( SELECT a.contact_mech_id, b.party_id, d.city, e.geo_id AS state_province_id, e.geo_name AS state_province_name FROM contact_mech a JOIN party_contact_mech b ON a.contact_mech_id = b.contact_mech_id JOIN party_contact_mech_purpose c ON b.contact_mech_id = c.contact_mech_id JOIN postal_address d ON c.contact_mech_id = d.contact_mech_id JOIN geo e ON d.state_province_geo_id = e.geo_id WHERE (b.from_date IS NULL OR b.from_date <= current_timestamp) AND (b.thru_date IS NULL OR b.thru_date > current_timestamp) AND (c.from_date IS NULL OR c.from_date <= current_timestamp) AND (c.thru_date IS NULL OR c.thru_date > current_timestamp) AND a.contact_mech_type_id = 'POSTAL_ADDRESS' AND c.contact_mech_purpose_type_id = 'HOME_ADDRESS' ) c ON a.party_id = c.party_id ) a ON a.party_id = per.party_id JOIN party_role pr ON per.party_id = pr.party_id AND pr.role_type_id = 'REWARD_PROGRAM_MEMBR' #if ($technicianId || $emailAddress || $firstName || $lastName) WHERE TRUE #if ($technicianId) AND cm_t.info_string = ? #end #if ($emailAddress) AND LOWER(cm_e.info_string) LIKE LOWER(?) #end #if ($firstName) AND LOWER(per.first_name) LIKE LOWER(?) #end #if ($lastName) AND LOWER(per.last_name) LIKE LOWER(?) #end #end #if ($orderBy == "technicianId") ORDER BY cm_t.info_string $orderByDirection #elseif ($orderBy == "emailAddress") ORDER BY cm_e.info_string $orderByDirection #elseif ($orderBy == "firstName") ORDER BY per.first_name $orderByDirection #elseif ($orderBy == "lastName") ORDER BY per.last_name $orderByDirection #end OFFSET $offset LIMIT $limit + 1 == _______________________________________________ Dev mailing list [hidden email] http://lists.ofbiz.org/mailman/listinfo/dev |
Adam,
Would you be able to set up your SVK so we can take a look at it and possibly play with it? Maybe we can look through your code and your examples? Thanks, Si Adam Heath wrote: >On Wed, 11 Jan 2006, Si Chen wrote: > > > >>Adam, >> >>This was from a while ago, but I'm curious where you are on this. >> >>I initially had reservations about another set of technologies in the >>OFBiz framework, but I think your solution could really help with >>"business intelligence" or reporting tasks. I would really like to be >>able to write complex queries against the OFBiz entity and field names >>(ie, not the underlying database ones) and then have them execute >>through the entity engine, while still preserving true database >>independence. >> >>So, please let me know if you're still interested in making this >>available and whether you'd like to get some help on it. >> >> > >Well, it can't do joins. Without that feature, it's mostly useless. > >I am using svk locally, and have this checked in. I could merge this to you >guys, easy enough, that way(svk push, after I merge the newer upstream into my >work branch). It's fairly contained, and the existing code doesn't change any >existing functionality. It's all new stuff. > >Using the sql as a macro expander, by making use of the meta data in >ModelEntity and ModelViewEntity, is not something I have attempted yet. It'd >require multiple passes over the AST(Abstract Syntax Tree). > >I do have code that can convert *all* entitymodel.xml files into compatible >CREATE TABLE/CREATE VIEW stmts, then code that can parse those sql stmts into >AST. I only did this because it seemed like a good way to verify my code was >handling everything. > >I don't have anything that handles SELECT/INSERT/UPDATE/DELETE. The AST parts >are there, but I've not started any visitor stuff. > >Lately, we have been skipping ofbiz entirely, and doing direct database >queries, that return a list of maps. This is way faster than doing loop code >in java, then combining as needed. > >For example, here's a query that we have setup. It's a velocity template, >which explains the $var and #directive stuff. It's just not possible to do >anything like this with ofbiz views. It's also something I want to be able to >support with my sql stuff, if I ever get around to it. > >== >SELECT > per.party_id AS "partyId", > per.first_name AS "firstName", > per.last_name AS "lastName", > a.tech_id AS "technicianId", > a.email_address AS "emailAddress", > a.city AS "city", > a.state_province_name AS "state", > a.address_id AS "addressId" >FROM > person per JOIN ( > SELECT > a.party_id, > a.tech_id, > b.email_address, > c.city, > c.state_province_name, > c.contact_mech_id AS address_id > FROM > ( > SELECT > a.contact_mech_id, > b.party_id, > a.info_string AS tech_id > FROM > contact_mech a JOIN party_contact_mech b ON > a.contact_mech_id = b.contact_mech_id > JOIN party_contact_mech_purpose c ON > b.contact_mech_id = c.contact_mech_id > WHERE > (b.from_date IS NULL OR b.from_date <= current_timestamp) > AND > (b.thru_date IS NULL OR b.thru_date > current_timestamp) > AND > (c.from_date IS NULL OR c.from_date <= current_timestamp) > AND > (c.thru_date IS NULL OR c.thru_date > current_timestamp) > AND > a.contact_mech_type_id = 'ECOUNT_PUID' > AND > c.contact_mech_purpose_type_id = 'ECOUNT_TECH_ID' > ) a JOIN ( > SELECT > a.contact_mech_id, > b.party_id, > a.info_string AS email_address > FROM > contact_mech a JOIN party_contact_mech b ON > a.contact_mech_id = b.contact_mech_id > JOIN party_contact_mech_purpose c ON > b.contact_mech_id = c.contact_mech_id > WHERE > (b.from_date IS NULL OR b.from_date <= current_timestamp) > AND > (b.thru_date IS NULL OR b.thru_date > current_timestamp) > AND > (c.from_date IS NULL OR c.from_date <= current_timestamp) > AND > (c.thru_date IS NULL OR c.thru_date > current_timestamp) > AND > a.contact_mech_type_id = 'EMAIL_ADDRESS' > AND > c.contact_mech_purpose_type_id = 'PRIMARY_EMAIL' > ) b ON > a.party_id = b.party_id > JOIN ( > SELECT > a.contact_mech_id, > b.party_id, > d.city, > e.geo_id AS state_province_id, > e.geo_name AS state_province_name > FROM > contact_mech a JOIN party_contact_mech b ON > a.contact_mech_id = b.contact_mech_id > JOIN party_contact_mech_purpose c ON > b.contact_mech_id = c.contact_mech_id > JOIN postal_address d ON > c.contact_mech_id = d.contact_mech_id > JOIN geo e ON > d.state_province_geo_id = e.geo_id > WHERE > (b.from_date IS NULL OR b.from_date <= current_timestamp) > AND > (b.thru_date IS NULL OR b.thru_date > current_timestamp) > AND > (c.from_date IS NULL OR c.from_date <= current_timestamp) > AND > (c.thru_date IS NULL OR c.thru_date > current_timestamp) > AND > a.contact_mech_type_id = 'POSTAL_ADDRESS' > AND > c.contact_mech_purpose_type_id = 'HOME_ADDRESS' > ) c ON > a.party_id = c.party_id > ) a ON > a.party_id = per.party_id > JOIN party_role pr ON > per.party_id = pr.party_id > AND > pr.role_type_id = 'REWARD_PROGRAM_MEMBR' > >#if ($technicianId || $emailAddress || $firstName || $lastName) >WHERE > TRUE >#if ($technicianId) > AND cm_t.info_string = ? >#end >#if ($emailAddress) > AND LOWER(cm_e.info_string) LIKE LOWER(?) >#end >#if ($firstName) > AND LOWER(per.first_name) LIKE LOWER(?) >#end >#if ($lastName) > AND LOWER(per.last_name) LIKE LOWER(?) >#end >#end >#if ($orderBy == "technicianId") >ORDER BY > cm_t.info_string $orderByDirection >#elseif ($orderBy == "emailAddress") >ORDER BY > cm_e.info_string $orderByDirection >#elseif ($orderBy == "firstName") >ORDER BY > per.first_name $orderByDirection >#elseif ($orderBy == "lastName") >ORDER BY > per.last_name $orderByDirection >#end >OFFSET > $offset >LIMIT > $limit + 1 >== > >_______________________________________________ >Dev mailing list >[hidden email] >http://lists.ofbiz.org/mailman/listinfo/dev > > > _______________________________________________ Dev mailing list [hidden email] http://lists.ofbiz.org/mailman/listinfo/dev |
Free forum by Nabble | Edit this page |