Creating Smart Document Solutions: Part I - Schemas
One of the classic examples for explaining structured markup is a recipe. An XML structure is much like a set of Russian nesting dolls, except that each doll can have siblings as well as children. Our biggest box (see the illustration) holds the entire recipe. It's broken down into smaller boxes that contain introductory information, followed by an equipment list, an ingredients list, the cooking directions, and nutritional information. Each of those containers is then broken into smaller components. The ingredients grouping holds the individual ingredient listings; the cooking directions container is broken down into individual steps. Continuing to deconstruct the recipe, each ingredient listing is made up of several pieces: a quantity, a type, an ingredient name, and possibly additional qualifiers. For instance, in "1 cup onions, chopped" the quantity is 1, the type is cup, the name is onions, and the qualifier is chopped. Each step would have a link to the ingredient(s) involved, any particular equipment required, and a time.
Why bother? Think of what you could do with a recipe collection that is marked up in this way. First, it would be easy to combine various recipes to create a shopping list. Recipes could be merged according to preparation and cooking time, giving you an overall set of directions for a complete meal. You could locate recipes based on what you have on hand and the amount of time you have available. You could publish your collection in a variety of ways: by type (breakfast, appetizer, main course, dessert), by food group (chocolate, meat, poultry, vegetable), or by complete meals (combinations of appetizers, soups, salads, main courses, and desserts).
In fact, there is such a DTD in the public domain; it's called RecipeML. I've created a W3C schema version of it as well. But you're probably not going to create a Smart Document solution for the authoring of recipes; instead you're going to need to take a look at the documents relevant to your particular application.
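To make the recipe structure and the shopping-list idea concrete, here is a minimal sketch in Python using only the standard library. The element names (`recipe`, `ingredient`, `quantity`, `unit`, `name`, `qualifier`) are hypothetical and chosen for illustration; they are not taken from RecipeML, and `unit` stands in for the "type" piece of an ingredient listing.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup: element names are illustrative, not RecipeML.
RECIPES = """
<recipes>
  <recipe>
    <title>Onion Soup</title>
    <ingredients>
      <ingredient>
        <quantity>1</quantity><unit>cup</unit>
        <name>onions</name><qualifier>chopped</qualifier>
      </ingredient>
      <ingredient>
        <quantity>4</quantity><unit>cup</unit><name>beef stock</name>
      </ingredient>
    </ingredients>
  </recipe>
  <recipe>
    <title>Omelet</title>
    <ingredients>
      <ingredient>
        <quantity>0.5</quantity><unit>cup</unit>
        <name>onions</name><qualifier>diced</qualifier>
      </ingredient>
      <ingredient>
        <quantity>3</quantity><unit>each</unit><name>eggs</name>
      </ingredient>
    </ingredients>
  </recipe>
</recipes>
"""


def shopping_list(xml_text):
    """Sum quantities per (name, unit) across every recipe in the collection."""
    totals = defaultdict(float)
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nesting depth does not matter here.
    for ing in root.iter("ingredient"):
        key = (ing.findtext("name"), ing.findtext("unit"))
        totals[key] += float(ing.findtext("quantity"))
    return dict(totals)


if __name__ == "__main__":
    for (name, unit), qty in sorted(shopping_list(RECIPES).items()):
        print(f"{qty:g} {unit} {name}")
```

Because quantity, unit, and name each live in their own element, merging two recipes is a simple lookup and sum. The same merge against free text such as "1 cup onions, chopped" would need fragile string parsing.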
Contrary to popular belief, you don't have to be an XML expert to design a document architecture. What's most important is that you understand the actual information: what it is, where it comes from, how it's used, and how it relates to the other components. That understanding needs to be combined with the goals and objectives of the project. Will you be automatically populating information from a database or a web service? Accessing document fragments from a SharePoint site? Using the same information set for multiple deliverables, such as marketing materials, sales proposals, and user guides, or student and instructor versions of training materials, including PowerPoint slides?
For instance, if your project is the development of a Smart Document solution for the creation of learning assessment materials (also known as tests), you'll want to identify each individual question so that questions can be mixed and matched. You may find that you have multiple-part questions, that is, several distinct questions that refer to some common information. Those will need to be associated in some way. And you'll need to be able to know the correct answer without displaying it to the subject.
Beyond that, you'll probably discover that you have some formatting characteristics that need to be accounted for. Here's where the ability to work with XML data islands can come in handy. If your raw XML dataset isn't going to be used by other applications, you can avoid some of the markup complexities typically inherent in a document-centric XML application. You can see excellent examples of this in several of the sample Smart Document solutions available in the SDK.
For each piece of information that's deemed important enough to have markup associated with it, you'll need to determine three basic rules:
There are a few additional rules to be considered:
The easiest way to represent this information visually is in a tree diagram. It can be created in Visio, in an XML schema editor, or even within Visual Studio (although I don't recommend this option for document-centric schema development). You can see some tree diagrams of the RecipeML schema here. Each tool uses its own set of symbols to represent the rules defined above; this diagram was created in TurboXML. The important thing to note in the recipe diagrams is the model for step. Step is a child of directions and can contain a number of different child elements, along with actual text. This is what's officially known as mixed content, and it is the main difference between document-centric and data-centric XML.
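Mixed content is easier to see in an instance than in a diagram. The sketch below assumes a hypothetical `step` element whose text is interleaved with `ingref` and `equipref` child elements and which carries a `time` attribute; these names are illustrative and are not claimed to match the RecipeML model.

```python
import xml.etree.ElementTree as ET

# Hypothetical step markup: running text and child elements are interleaved.
STEP = (
    '<step time="PT10M">Saute the <ingref>onions</ingref> in a '
    '<equipref>skillet</equipref> until golden.</step>'
)


def flatten(elem):
    """Return the readable text of a mixed-content element in document order."""
    # itertext() yields the element's own text, each child's text, and the
    # "tail" text that follows each child, in the order they appear.
    return "".join(elem.itertext())


def references(elem, tag):
    """Collect the text of each direct child element with the given tag."""
    return [child.text for child in elem.findall(tag)]


step = ET.fromstring(STEP)

if __name__ == "__main__":
    print(flatten(step))
    print(references(step, "ingref"), references(step, "equipref"))
```

A data-centric model would put every value in its own leaf element with no loose text between them. Here the sentence reads naturally for a person while the references stay addressable by software, which is the trade-off a document-centric schema has to support.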
While it may seem that adopting an industry-standard schema could save some development time, in the long run your project will suffer. Instead, take the time to do a thorough analysis of your information set. Once that's completed you can review what's publicly available and see if there's something that comes close to your particular requirements. Why bother adopting an existing model if you've already done everything but the actual coding? Because there may be other components available as well, such as stylesheets and transforms, that will shorten your overall development schedule. On the other hand, an advantage of creating your own schema is that you will be intimately familiar with its vocabulary and grammar, since you were at least partially responsible for developing it. When it comes time to test, troubleshoot, or add enhancements, you'll have a much better understanding of the impact any changes will have on your overall application. Remember, your schema is the foundation of your application; if it needs to be changed, chances are most other components of your solution will also need to be modified.
There are four basic types of content models in XML Schema:
In DTD syntax, these differences are not explicitly stated but inferred from the model itself. XML Schema also states explicitly whether the children in a complex model must occur in sequence or whether the appropriate child element is chosen from a list. Some of the power of schemas is evident in the occurrence indicators; rather than the three choices available in DTDs (0-1, 0-many, 1-many), schemas allow the designer to enter a specific number or range. Similarly, schemas support datatyping, so additional validation can be performed to ensure that a date or phone number was entered in the proper format. The subject of schemas fills entire books; as mentioned previously, much of the specification is targeted towards data-centric applications, adding functionality equivalent to what can be found in most relational databases. This makes perfect sense, since much of what XML is about is data interchange.
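The occurrence ranges and datatypes described above look like this in a schema. This is a sketch under stated assumptions: the `directions`, `step`, `note`, and `tested` elements are hypothetical. The Python wrapper only parses the fragment and reads back the declared rules; it does not validate documents, since the standard library has no XML Schema validator.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment; element names are illustrative only.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="directions">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="step" type="xs:string"
                    minOccurs="1" maxOccurs="20"/>
        <xs:element name="note" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="tested" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def occurrence_rules(schema_text):
    """Map each typed element declaration to (minOccurs, maxOccurs, type)."""
    rules = {}
    root = ET.fromstring(schema_text)
    for decl in root.iter(XS + "element"):
        if decl.get("type") is None:
            continue  # skip the wrapper element that holds the complex type
        # XML Schema defaults both minOccurs and maxOccurs to 1 when omitted.
        rules[decl.get("name")] = (
            decl.get("minOccurs", "1"),
            decl.get("maxOccurs", "1"),
            decl.get("type"),
        )
    return rules


if __name__ == "__main__":
    for name, rule in occurrence_rules(SCHEMA).items():
        print(name, rule)
```

The `xs:sequence` wrapper is the explicit statement that children must appear in order; swapping it for `xs:choice` would mean picking among them instead. A range such as 1 to 20 has no DTD equivalent, and `xs:date` is one of the built-in datatypes a validating parser can check.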
© 2004 Mary P McRae. All Rights Reserved.