

Useful tips for working with XML

How not to look incompetent when working with XML


Henri Sivonen has written a rather useful article, "HOWTO Avoid Being Called a Bozo When Producing XML" (see Resources). In it he discusses how to work with XML correctly, using namespaces and the capabilities of the XML-based web feed formats RSS and Atom. In the introduction he notes that some developers consider it very difficult, or even practically impossible, to achieve well-formedness when generating XML documents programmatically. At the same time, other developers handle this task easily and cannot understand why the rest are so incompetent. Nobody wants to think of themselves as incompetent. The tips below will help you avoid that unpleasant feeling.


Henri Sivonen's first piece of advice is not to think of XML as a text format. The author of this article considers this rather dangerous advice. Its basic idea is correct: when creating or editing an XML document you need to be more careful than when working with an ordinary text document. But that applies to any structured text format. The claim that an XML document is not text, however, denies one of the basic characteristics of XML, stated in the definition given in its specification ("A textual object is a well-formed XML document [if it conforms to this specification]"). Moreover, XML has a technical definition of text as the sequence of characters interpreted as XML. Text is not merely the characters enclosed by the elements of the tree or by attributes; that construct is technically called character data. Text is the basis of all XML entities, so the claim that XML is not text is untenable. It is far more useful to emphasize the specific features that distinguish XML from the text formats developers already know.


Henri Sivonen is, of course, right to warn that you cannot carelessly lump pieces together and hope that the resulting XML document will be well-formed. When creating XML documents it is better to use mature XML toolkits rather than simple text tools (see the author's article in Resources). The general advice is this: do not use a mechanism unless you are confident it will produce a well-formed XML document. One approach to creating XML documents safely is to emit SAX events, driven by a tree, a stack or an XML parser. Keep in mind, however, that SAX tools may not perform every well-formedness check that is needed. For example, some Unicode characters are not allowed in XML, and additional checks may be needed to catch such cases.
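The approach described above can be sketched in Python with the standard library's `xml.sax.saxutils.XMLGenerator`. This is only an illustration under stated assumptions: the element names `entry` and `title` and the helper names `check_chars` and `build_feed_entry` are hypothetical, and the character check covers only the XML 1.0 rule on allowed characters, not every well-formedness constraint.

```python
import io
import re
from xml.sax.saxutils import XMLGenerator

# Anything outside the XML 1.0 "Char" production (for example most control
# characters) must not appear in a document, even though it is valid Unicode.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def check_chars(text):
    """Return text unchanged, or raise ValueError if XML 1.0 forbids a character."""
    match = _ILLEGAL_XML_CHARS.search(text)
    if match:
        raise ValueError(f"character {match.group()!r} is not allowed in XML 1.0")
    return text


def build_feed_entry(title):
    """Emit a tiny document as SAX events instead of pasting strings together."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("entry", {})   # hypothetical element names
    gen.startElement("title", {})
    # The generator escapes & and <; the extra check is ours.
    gen.characters(check_chars(title))
    gen.endElement("title")
    gen.endElement("entry")
    gen.endDocument()
    return out.getvalue()
```

Calling `build_feed_entry("Q&A <live>")` yields escaped markup without any manual string handling, while a title containing a vertical tab (`"\x0b"`) is rejected by the additional check rather than silently written out.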


The suggestion that users should not manage namespaces by hand also looks reasonable. XML namespaces must be handled with great care. Developers usually work with universal names (a namespace URI, or Uniform Resource Identifier, plus a local name), but sometimes they have to deal with prefixes or namespace declarations. In specifications such as XSLT (Extensible Stylesheet Language Transformations), QNames (a combination of a prefix and a local name) can be used inside attribute values. The prefix is then assumed to be interpreted according to the namespace declarations in scope. This usage is called a QName in context. In this case the developer must control the declared prefix, otherwise subsequent XML processing will fail. But when developers do take full control of their own namespace declarations, the result is often unpredictable because of the complexity of XML namespaces.
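As a rough illustration of working with universal names rather than prefixes, here is a minimal Python sketch using the standard `xml.etree.ElementTree` module, which writes a universal name as `{namespace-uri}local-name`. The namespace URI, the element names and the helper names `title_of` and `build_feed` are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

FEED_NS = "http://example.com/ns/feed"  # hypothetical namespace URI

# The same information written with two different prefix choices.
doc_default = f'<feed xmlns="{FEED_NS}"><title>News</title></feed>'
doc_prefixed = f'<f:feed xmlns:f="{FEED_NS}"><f:title>News</f:title></f:feed>'


def title_of(xml_text):
    """Look the title up by universal name, ignoring whatever prefix was used."""
    root = ET.fromstring(xml_text)
    return root.find(f"{{{FEED_NS}}}title").text


def build_feed(title):
    """Create elements by universal name; the serializer declares a prefix itself."""
    root = ET.Element(f"{{{FEED_NS}}}feed")
    ET.SubElement(root, f"{{{FEED_NS}}}title").text = title
    return ET.tostring(root, encoding="unicode")
```

Both sample documents give the same answer from `title_of`, because the lookup never mentions a prefix. Note that this does nothing for QNames in context: a prefix stored inside an attribute value is plain text to the library and still has to be tracked by the developer.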


One way to tidy up namespace syntax that may have been disturbed during XML processing is to insert a so-called canonicalization step, that is, a step independent of the final implementation, at the end of processing. XML canonicalization removes the syntactic variability that XML 1.0 and XML namespaces permit, including the various quirks of namespace declarations. Keep in mind, though, that canonicalization cannot remove every problem that makes namespace declarations unreliable for developers. It does not help with QNames in context, because it does not change the prefixes used in the document. It does, however, greatly reduce the clutter of namespace declarations, to the point where the developer can easily spot problems or even write a program to fix them automatically. The GenX library generates canonical XML automatically, and many other toolkits offer canonicalization as an option.
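A canonicalization step of this kind can be sketched with `xml.etree.ElementTree.canonicalize`, available in Python 3.8 and later (it implements C14N 2.0). The sample documents and the `urn:example` namespace are invented for the illustration, and the sketch relies on the function's default options.

```python
import xml.etree.ElementTree as ET


def canonical(xml_text):
    """Return the canonical form of a document given as a string."""
    return ET.canonicalize(xml_text)


# Two spellings of the same document: different attribute order, different
# quote characters, and an empty element written in two ways.
variant_1 = "<doc xmlns:p='urn:example' b=\"2\" a=\"1\"><p:item/></doc>"
variant_2 = '<doc a="1" b="2" xmlns:p="urn:example"><p:item></p:item></doc>'

# Same namespace, but bound to a different prefix.
other_prefix = '<doc a="1" b="2" xmlns:q="urn:example"><q:item></q:item></doc>'

same_after_c14n = canonical(variant_1) == canonical(variant_2)
prefix_still_differs = canonical(variant_1) != canonical(other_prefix)
```

The first comparison shows the syntactic variation disappearing, and the second shows the limitation noted above: with default options the prefixes are left as written, so documents that differ only in prefix choice remain different after canonicalization.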


In the author's opinion, Henri Sivonen's advice against adding pretty-printing whitespace to character data is somewhat overstated. According to Sivonen, if an XML document looks like Listing 1, then representing it as shown in Listing 2 is, as a rule, not safe.


Listing 1. Example XML



<foo> bar </foo>


Listing 2. Example XML with whitespace added to the character data



<foo>
  bar
</foo>


Pretty-printing the XML document shown in Listing 3, on the other hand, is a safe operation (Listing 4).


Listing 3. Another XML example



<doc> <foo> bar </foo> </doc>


Listing 4. The XML from Listing 3 with whitespace added to the character data



<doc>
  <foo> bar </foo>
</doc>


Many serialization tools recognize this difference between relatively safe and relatively unsafe pretty-printing. It is important to understand that the form of pretty-printing shown in Listings 3 and 4 can still cause distortion if whitespace is added to mixed content. These problems can be avoided if the transformation is driven by a schema. In practice, however, most vocabularies that use mixed content are not very sensitive to whitespace normalization, so pretty-printing does not deserve too much attention. Just keep this potential problem in mind and make sure that pretty-printing can be turned off (preferably it should be off by default). Henri Sivonen recommends the style of pretty-printing shown in Listing 5, but the author of this article disagrees with him, since such markup looks ugly and is inconvenient to work with.
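The difference between the safe and the risky case can be observed with a small Python sketch based on `xml.etree.ElementTree.indent` (Python 3.9 and later). The helper names `pretty` and `character_data` and the mixed-content sample are assumptions made for the example; other serializers may place whitespace differently.

```python
import xml.etree.ElementTree as ET


def pretty(xml_text):
    """Parse a document and indent it in place with two spaces per level."""
    root = ET.fromstring(xml_text)
    ET.indent(root, space="  ")
    return root


def character_data(elem):
    """Concatenate all character data found in elem and its descendants."""
    return "".join(elem.itertext())


# Element-only content, as in Listing 3: whitespace lands between elements
# and the text of <foo> is left alone.
safe = pretty("<doc><foo>bar</foo></doc>")

# Mixed-content style markup: two inline elements with nothing between them.
mixed_source = "<p><b>big</b><i>world</i></p>"
mixed = pretty(mixed_source)

before = character_data(ET.fromstring(mixed_source))
after_normalized = " ".join(character_data(mixed).split())
```

Here `before` is the single word `bigworld`, while `after_normalized` contains a space between `big` and `world` even after whitespace normalization, which is exactly the kind of distortion that makes an off switch for pretty-printing worth having.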


Listing 5. The pretty-printing style suggested by Henri Sivonen, but not endorsed by the author of this article



<foo> bar </foo>


More advice for those who work with XML


XML is rather simple in design, so using it for overly complex structures is not optimal. Simon St. Laurent's fairly detailed piece "Monastic XML" (see Resources) is devoted to these problems. He discusses the fundamental role of character data and markup (elements and attributes). He also explains why the generic identifier, also known as the element type name, is an important concept, and how it can be made the sole primary key to the structure of marked-up information. In reality, when XML namespaces are used, the primary key is the universal name (namespace URI plus local name). This complexity is one of the reasons St. Laurent urges caution in using namespaces.


Another problem in XML is working with trees. Although at first glance it seems that the hierarchical structure of XML can easily be extended to graph structures, in practice modeling graphs in XML turns out to be difficult.


Finally, one more important piece of advice concerns optimizing markup for the processing of XML documents. XML is a declarative technology, and this fact is both its main strength and a source of frustration for many developers. Developers who try to tie the design of their XML too closely to the details of processing eventually find that processing becomes more complex. The key to working with XML successfully is to focus on the essence of the information, which should be represented in an abstract form, and to keep it separate from the technical design of the systems that will process that information.



Conclusion


When best practices for XML are analyzed, opinions will always differ, especially at the current, early stage of the language's development, but that is a good thing. Besides the topics listed above, there are other important subjects for discussion, so there is no reason to stop at what has been achieved.