writestruct does not reproduce the input of readstruct
    9 vues (au cours des 30 derniers jours)
  
       Afficher commentaires plus anciens
    
    Adrien Leygue
      
 le 16 Fév 2023
  
    
    
    
    
    Commenté : Jeremy Hughes
    
 le 21 Fév 2023
            I find it disturbing that applying writestruct on a structure created with readstruct does not reproduce the original input: see example below.
 Is there a way to fix this? Readstruct is very convenient for reading and modifying lighweight xml file (in my case xdmf file to be opened by Paraview) but this issue is in my opinion a major flaw.
Thanks for any advice.
Adrien.
>> type input.xml
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
    <Tag2 Name="foo">
        1 2 3
    </Tag2>
</Tag1>
>> h = readstruct("input.xml","structnodename","Tag1");
>> writestruct(h,"output.xml","structnodename","Tag1")
>> type output.xml
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
    <Tag2 Name="foo">
        <Text>1 2 3</Text>
    </Tag2>
</Tag1>
>> 
1 commentaire
  Stephen23
      
      
 le 16 Fév 2023
				
      Modifié(e) : Stephen23
      
      
 le 17 Fév 2023
  
			"writestruct does not reproduce the input of readstruct"
Nor is this expected: there is a large set of input files which will produce exactly the same structure once imported. The XML standard specifically states that a lot of file formatting (e.g. different whitespace) and "irrelevant" XML formatting (e.g. attribute order) is not signficant and should be considered equivalent. In your specific example, note that XML elements may contain text, attributes, other elements, or any mix of these:
Your "1 2 3" are themselves not elements or attributes, so must be text. MATLAB is semantically correct.
"...but this issue is in my opinion a major flaw."
Your proposal is impossible: in general there is no way to know which exact XML file generated a particular structure when imported into MATLAB (or any other application). The XML standard specifically states that this should not be possible.
This applies not only to READSTRUCT/WRITESTRUCT, but every other "pair" of import/export functions, e.g. READMATRIX accepts an uncountably large set of input files, which WRITEMATRIX cannot reproduce from the matrix alone. This is a necessary corollary of applying Postel's law:
Réponse acceptée
  Jeremy Hughes
    
 le 16 Fév 2023
        Unfortunately, there's no way to get that to happen 100% of the time, and readstruct/writestruct are not meant to do that. In fact, round tripping from a data source to any other representation and back again is seldom fully round-tripable unless the two systems were designed together to be that way.
In this case, in order to be able to do document transformation, you have to know far more than what a MATLAB struct is capable of storing. E.g. whether the data was in an attribute, or part of the text node as in your case, or in a node named "Text" to begin with.... or this thing:
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
    <Tag2 Name="foo">
        <Text>1</Text>
        <Text>2</Text>
        <Text>3</Text>
    </Tag2>
</Tag1>
To get the kind of fidelity for a round trip of XML across all the valid XML files, you necessarily lose some simplicity. Essentially, there's not a 1:1 mapping between MATLAB structs and XML. To get that and still have some sembalance of usability you have to move to objects.
Good(ish) news, there is a MATLAB XML DOM: https://www.mathworks.com/help/matlab/import_export/importing-xml-documents.html, and it (and similarly xmlread) will allow you to do anything, but at a cost of both a steeper learning curve, and more code complexity. Note that the XML DOM APIs are not designed by MathWorks; here's the spec: https://www.w3.org/TR/WD-DOM/ 
... but I don't recommend the read unless you're having trouble sleeping.
Basically, it's not a simple problem. XML is a very complex format, and working with it can be painful. readstruct/writestruct are really about getting some MATLAB data into that format and back again, and not the other way around, though readstruct should allow you to get data into MATLAB and work with it for general XML files.
3 commentaires
  Jeremy Hughes
    
 le 21 Fév 2023
				Thanks for the info Adrien,
I'll put in an enhancement request for an option to handle "Text" fields on writing.
Plus de réponses (0)
Voir également
Catégories
				En savoir plus sur Data Type Identification dans Help Center et File Exchange
			
	Produits
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!


