Thanks Stefan. All interesting points.

Can you build bitmap indices against a specific property in a document container?   Do we have any data on the performance of large document datasets (using indices) as compared to large persistent tables of Caché objects?

In terms of storing complex values in a property - in actuality would there be any additional support for handling that same complex property as a piece of a document?  It seems to me that you need to write all of your own handling code in all cases if you use documents.  I guess an advantage would be the ability to subindex pieces of a document (I assume that is supported or soon will be?).

Thanks for the details on documenting the schemas.  I was guessing it was something along those lines - again, sort of a throwback to keeping separate global structure documents in legacy Caché.  Not a deal breaker, but it takes a higher level of discipline to keep the code maintainable as it evolves over time.

I learned a lot from this thread - thanks!

Stefan,

Thanks for your thoughtful response.  A few comments and questions on your clarifications:

  • Flexibility
    • I am certain there are some more extreme situations where migration is necessary, but the majority of my experience has been around data evolution with Caché Persistent Objects, which is trivial to achieve (even renaming a field is straightforward without moving any data).  If someone has to do a massive data migration, moving data from one set of objects to a refactored set of objects, then there is certainly some work involved.
  • Sparseness
    • Your point about the $lb performance is interesting.  I did some quick tests: compared with looping over a $lb structure 10M times to add the 2 elements in the 1st and 2nd positions, the run time appears to increase by 40% when accessing the 1st and 20th positions and by 270% when accessing the 1st and 60th positions.  So there is something to be said about the hit for extremely sparse persistent objects that are referenced many, many times in succession (though even my 60th position test only took 1.9 sec to access the two elements 10M times).  In reality I am not sure what type of data model would actually run into this consideration (that, and the need to be concerned about the tiny waste of space for null placeholders in a sparse $listbuild).  But it is an interesting thing to think about.
  • Hierarchical
    • Your statement about objects only being able to store an embedded object is not entirely correct.  One of the advantages of the Parent/Child relationship is that the children are colocated in the storage of the parent.  This has major performance implications (just like serial objects) and is a very powerful construct as a result.  The only places where you are not colocating the data are the one-to-many relationship or a linked class instance.  With a complex data model you would certainly have fewer reads with a document store.
  • Dynamic Types
    • Your example isn't quite accurate - since Caché stores everything as a string, you can give yourself as much flexibility as you want by not using a constrained type when designing your object.  You could define name as a %String in a persistent Caché object and then store either the value "Stefan Wittmann" or the value "{'first':'Stefan','last':'Wittmann','middle':null}" :)  You have full flexibility if you choose to have it - or you can use typing to leverage the built-in validation.
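
To make the Dynamic Types point concrete, here is a minimal sketch in Python rather than ObjectScript. It assumes a single untyped string field that may hold either a plain name or a JSON document. The function name `display_name` and the `first`/`middle`/`last` keys are hypothetical, chosen only for illustration. It also shows the flip side of that flexibility: the handling and validation code is yours to write.

```python
import json


def display_name(stored: str) -> str:
    """Return a printable name from an untyped string field.

    The field may hold a plain name or a JSON object with
    'first', 'middle' and 'last' keys (hypothetical layout).
    """
    try:
        value = json.loads(stored)
    except json.JSONDecodeError:
        # Not JSON at all: treat the field as a plain name.
        return stored
    if not isinstance(value, dict):
        # Valid JSON but not an object (e.g. a bare number): use the raw text.
        return stored
    # Hand-written validation: nothing enforces these keys for us.
    if not value.get("first") or not value.get("last"):
        raise ValueError("structured name needs 'first' and 'last'")
    parts = [value["first"], value.get("middle"), value["last"]]
    return " ".join(p for p in parts if p)
```

Calling `display_name("Stefan Wittmann")` and `display_name('{"first": "Stefan", "last": "Wittmann", "middle": null}')` both give the same printable name. Note that a strict JSON parser needs double-quoted keys and values, so the single-quoted form would be treated as a plain string here.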

From my perspective, while there is certainly increased flexibility to be gained with documents, that comes at the price of having to write more validation and processing code.  In addition, Caché Persistent Objects make it very easy to have your schema and structure be self-documenting (and the class definition can be the 'source of truth').  With Document, which in many respects is a move back in the direction of 'roll your own global structures', the developer would be on the hook for creating external documentation on the structure and field uses of the documents that are stored in the container.  Certainly picking good property names is a step in the right direction, but that doesn't get you as far as you can get with class and property comments in a persistent class definition.  How do you envision people documenting their document schema?

Stefan,

This is a great article and an excellent resource to help people come up to speed quickly on this new feature - thank you.

I do have one question / comment, however.  You listed 4 benefits to using the Document approach, and I certainly see all of these as benefits over Relational DBs, but it appears to me that Caché Objects have the exact same benefits for points 2, 3 and 4, and it is so trivial to update Caché object schemas that I don't really see 1 as being very convincing to a Caché Object developer.

I am assuming these benefits are more targeted to people looking to switch from relational DBs?  Or do you see 'killer app' type possibilities for Caché-based shops that already make extensive use of Caché Objects (and therefore already have easy schema updates, sparse storage, etc.)?

Thanks!

Ben

Brian,

For years this was the official InterSystems-sponsored site for this type of thing:

http://skills.intersystems.com/

However, it looks rather empty at the moment (I think things expire after 6 months or something like that).  It wouldn't hurt to post your position there - perhaps with increased advertising on the Developer Community this will pick up in its usage again.

HTH,

Ben 

I don't have a complete answer for you, but presumably you could grab the binary stream via SQL and then save the stream to disk via the %Stream.FileBinary class interface.  We manage JPG files through that interface, but I haven't done the fetch via SQL before (so hopefully someone who has fetched a JPG via SQL could comment on that piece).
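
As a rough sketch of the general idea only, here is the fetch-then-write pattern in Python, with the standard-library `sqlite3` module standing in for Caché SQL. None of this is the %Stream.FileBinary API. The `images` table, its `id` and `data` columns, and the `save_image` function are hypothetical.

```python
import sqlite3
from pathlib import Path


def save_image(conn: sqlite3.Connection, image_id: int, target: Path) -> int:
    """Fetch one binary column via SQL and write it to `target`.

    Returns the number of bytes written. The `images` table and its
    `id`/`data` columns are hypothetical.
    """
    row = conn.execute(
        "SELECT data FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no image with id {image_id}")
    # Write in binary mode so the JPG bytes are stored unchanged.
    return target.write_bytes(row[0])
```

The key point is that the bytes come back from the query as-is and are written in binary mode, with no text encoding step in between.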

Mike - thanks for posting this for everyone's benefit!  One minor point though - is this a Question (as you created it) or an Article?  Is there anything  you are asking specifically? (I don't know that there is a way for you to change it to an article at this point but you should note the difference for next time).

I do want to let everyone know that they can rely on the link https://download.InterSystems.com always pointing to our evaluation kit download application (even if the rest of the website is rearranged).  That is an application managed by my team and it is intended to always be accessible through that entry point.

Sankar,

For a version of Caché this old (circa 2002) I strongly suggest that you contact the WRC directly for assistance (http://www.intersystems.com/services-support/product-support/).  That will be your best bet for getting your system back online as quickly as possible.

Please feel free to report back here as to the root cause and resolution for what you're seeing.

Ben

Mike,

I am not sure about the "access denied" message you are seeing - perhaps Paul could comment on that.  However, to manage all of your subscriptions:

1) Log in

2) Click on My Account (top icon bar)

3) Click on the Subscriptions tab

Then I strongly recommend that you click "Settings" and Uncheck "Digest Mode".  The Digest emails are really not very usable (IMHO) and work has been done to make the individual emails more useful - you may find those to be better and closer to what you are used to in the Google Groups (assuming you didn't use Digest in Google Groups).  

Under the "Page / Threads" tab you can unsubscribe from the threads you are no longer interested in.

I hope that helps.  The email update mechanism has come a long way but it still has quite a ways to go.  Work will be commencing shortly to enable HTML-based email, which should be an improvement as well.