Tayson:

Hello,
I have a file in OAI-PMH format, which is something like XML but apparently is not XML, because no XML converter will accept it. I need this file in CSV format. Alternatively, there is supposed to be an XSLT transformation that converts one XML document into another, but that does not work for me either. Can you advise me how to proceed? A sample of the file looks roughly like this:

D:\Harvest\harvester2-0.1.12>java -classpath .;log4j-1.2.12.jar;harvester2.jar;xalan.jar ORG.oclc.oai.harvester2.app.RawWrite http://citeseerx.ist.psu.edu/oai2
Fri Mar 04 14:13:14 CET 2011
<?xml version="1.0" encoding="UTF-8"?>
<harvest>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2011-03-04T13:13:01+00:00</responseDate>
  <request verb="Identify">http://citeseerx.ist.psu.edu/oai2</request>
  <Identify>
    <repositoryName>"CiteSeerX Scientific Literature Digital Library and Search Engine"</repositoryName>
    <baseURL>http://citeseerx.ist.psu.edu/oai2</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>csx-system@ist.psu.edu</adminEmail>
    <earliestDatestamp>1970-01-01</earliestDatestamp>
    <deletedRecord>no</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
    <compression>gzip</compression>
    <compression>zip</compression>
    <compression>deflate</compression>
    <description>
      <oai-identifier xmlns="http://www.openarchives.org/OAI/2.0/oai-identifier" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai-identifier http://www.openarchives.org/OAI/2.0/oai-identifier.xsd">
        <scheme>oai</scheme>
        <repositoryIdentifier>CiteSeerXPSU</repositoryIdentifier>
        <delimiter>:</delimiter>
        <sampleIdentifier>oai:CiteSeerXPSU:10.1.1.1.1867</sampleIdentifier>
      </oai-identifier>
    </description>
    <description>
      <eprints xmlns="http://www.openarchives.org/OAI/1.1/eprints" xsi:schemaLocation="http://www.openarchives.org/OAI/1.1/eprints http://www.openarchives.org/OAI/1.1/eprints.xsd">
        <content>
          <text>Computer and Information Science Publications collected by CiteSeerX.PSU</text>
        </content>
        <metadataPolicy>
          <text>Full texts are individually tagged and the rights statements must be adhered to</text>
        </metadataPolicy>
        <dataPolicy>
          <text>Full texts are individually tagged and the rights statements must be adhered to</text>
        </dataPolicy>
      </eprints>
    </description>
  </Identify>
</OAI-PMH>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2011-03-04T13:13:02+00:00</responseDate>
  <request verb="ListMetadataFormats">http://citeseerx.ist.psu.edu/oai2</request>
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2011-03-04T13:13:02+00:00</responseDate>
  <request>http://citeseerx.ist.psu.edu/oai2</request>
  <error code="noSetHierarchy">This repository does not support sets</error>
</OAI-PMH>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2011-03-04T13:13:02+00:00</responseDate>
  <request metadataPrefix="oai_dc" verb="ListRecords">http://citeseerx.ist.psu.edu/oai2</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:CiteSeerXPSU:10.1.1.1.1484</identifier>
        <datestamp>2009-05-24</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:title>Winner-Take-All Network Utilising Pseudoinverse Reconstruction Subnets Demonstrates Robustness on the Handprinted Character Recognition Problem</dc:title>
          <dc:creator>J. Körmendy-rácz</dc:creator>
          <dc:creator>S. Szabó</dc:creator>
          <dc:creator>J. Lörincz</dc:creator>
          <dc:creator>G. Antal</dc:creator>
          <dc:creator>G. Kovács</dc:creator>
          <dc:creator>A. Lörincz</dc:creator>
          <dc:subject>Correspondence and offprint requests to</dc:subject>
          <dc:subject>J. Kormendy-Rácz</dc:subject>
          <dc:description>Wittmeyer’s pseudoinverse iterative algorithm is formulated&#13;
as a dynamic connectionist Data Compression and Reconstruction (DCR) network, and subnets of this type are supplemented by the winner-take-all paradigm. The winner is selected upon the goodness-of-fit of the input reconstruction. The network can be characterised as a competitive-cooperative-competitive architecture by virtue of the contrast enhancing properties of the pseudoinverse subnets. The network is capable of fast learning. The adopted learning method gives rise to increased sampling in the vicinity of dubious boundary regions that resembles the phenomenon of categorical perception. The generalising abilities of the scheme allow one to utilise single bit connection strengths. The network is robust against input noise and contrast levels, shows little sensitivity to imprecise connection strengths, and is promising for mixed VLSI implementation with on-chip learning properties. The features of the DCR network are demonstrated on the NIST database of handprinted characters.</dc:description>
          <dc:contributor>CiteSeerX</dc:contributor>
          <dc:publisher>Springer</dc:publisher>
          <dc:date>2009-05-24</dc:date>
          <dc:date>2007-11-19</dc:date>
          <dc:date>1999</dc:date>
          <dc:format>application/pdf</dc:format>
          <dc:type>text</dc:type>
          <dc:identifier>http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.1484</dc:identifier>
          <dc:source>http://people.inf.elte.hu/lorincz/Files/publications/WTA_NCA.pdf</dc:source>
          <dc:language>en</dc:language>
          <dc:relation>10.1.1.134.6077</dc:relation>
          <dc:relation>10.1.1.65.2144</dc:relation>
          <dc:relation>10.1.1.54.7277</dc:relation>
          <dc:relation>10.1.1.48.5282</dc:relation>
          <dc:rights>Metadata may be used without restrictions as long as the oai identifier remains attached to it.</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header>
        <identifier>oai:CiteSeerXPSU:10.1.1.1.1485</identifier>
        <datestamp>2009-05-24</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:title>DEEM: a Tool for the Dependability Modeling and Evaluation</dc:title>
          <dc:creator>A. Bondavalli</dc:creator>
          <dc:creator>I. Mura</dc:creator>
          <dc:creator>S. Chiaradonna</dc:creator>
          <dc:creator>S. Poli</dc:creator>
          <dc:creator>F. Sandrini</dc:creator>
          <dc:subject>Processes</dc:subject>
          <dc:description>Multiple-Phased Systems, whose operational life can be partitioned in a set of disjoint periods, called ¿phases¿; include several classes of systems such as Phased Mission Systems and Scheduled Maintenance Systems. Because of their deployment in critical applications, the dependability modeling and analysis of Multiple-Phased Systems is a task of primary relevance. However, the phased behavior makes the analysis of Multiple-Phased Systems extremely complex. This paper is centered on the description and application of DEEM, a dependability modeling and evaluation tool for Multiple Phased Systems. DEEM supports a powerful and efficient methodology for the analytical dependability modeling and evaluation of Multiple Phased Systems, based on Deterministic and Stochastic Petri Nets and on Markov Regenerative Processes.</dc:description>
          <dc:contributor>CiteSeerX</dc:contributor>
          <dc:publisher>IEEE Computer Society</dc:publisher>
          <dc:date>2009-05-24</dc:date>
          <dc:date>2007-11-19</dc:date>
          <dc:date>2000</dc:date>
          <dc:format>application/pdf</dc:format>
          <dc:type>text</dc:type>
          <dc:identifier>http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.1485</dc:identifier>
          <dc:source>http://bonda.cnuce.cnr.it/Documentation/Papers/file-BMCFPS00-DSN2000-76.pdf</dc:source>
          <dc:language>en</dc:language>
          <dc:relation>10.1.1.47.2594</dc:relation>
          <dc:relation>10.1.1.58.2039</dc:relation>
          <dc:rights>Metadata may be used without restrictions as long as the oai identifier remains attached to it.</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
 
10.3.2015 14:23

Replying to Tayson
Jan Vargovský:

XmlSerializer, or just as well DataContractSerializer, should be able to convert it into an object. You would just have to annotate all the properties with the right attributes, and then you could run the object through reflection and export CSV from it.
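A minimal sketch of that idea, written in Python rather than .NET purely for illustration: each `<record>` becomes an object, and the CSV columns are derived from the object's fields by introspection (the Python counterpart of reflection). The `Record` class, its field names and the helper functions are hypothetical; the namespace URIs and the shortened `SAMPLE` are taken from the sample in the question.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

# Namespace URIs copied from the sample in the question.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of one record from the question (well-formed, closed properly).
SAMPLE = """<harvest>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:CiteSeerXPSU:10.1.1.1.1485</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
                   xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">
          <dc:title>DEEM: a Tool for the Dependability Modeling and Evaluation</dc:title>
          <dc:creator>A. Bondavalli</dc:creator>
          <dc:creator>I. Mura</dc:creator>
          <dc:language>en</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
</harvest>"""


@dataclass
class Record:  # hypothetical class; fields chosen only for this sketch
    identifier: str
    title: str
    creators: str
    language: str


def text_of(node, path):
    """Text of the first element matching path, or "" when it is absent."""
    found = node.find(path, NS)
    return (found.text or "") if found is not None else ""


def load_records(xml_text):
    """Turn every <record> element into a Record object."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter("{%s}record" % NS["oai"]):
        creators = [c.text or "" for c in rec.iterfind(".//dc:creator", NS)]
        records.append(Record(
            identifier=text_of(rec, "oai:header/oai:identifier"),
            title=text_of(rec, ".//dc:title"),
            creators="; ".join(creators),  # repeated elements share one column
            language=text_of(rec, ".//dc:language"),
        ))
    return records


def to_csv(records):
    """Derive the column names from the dataclass fields and emit CSV text."""
    names = [f.name for f in fields(Record)]
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(names)
    for r in records:
        writer.writerow([getattr(r, name) for name in names])
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(load_records(SAMPLE)))
```

Adding a column then only means adding a field to `Record` and filling it in `load_records`. Joining repeated `dc:creator` values with "; " is one possible choice, not something the thread prescribes.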

 
10.3.2015 15:53

Replying to Jan Vargovský
Tayson:

But the file is not of XML type, so I don't know whether I will be able to use XmlSerializer or the other one you mentioned. I tried to read it as XML, but it crashed because the tags contain names such as dc:language, and that dc prefix is reported as an error.

 
10.3.2015 18:30

Replying to Tayson
Jan Vargovský:
"annotate all the properties with the right attributes"

The serializer can work with namespaces too. You just have to tell it about them.
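To illustrate what "telling it" about namespaces means, here is a small sketch using Python's `xml.etree.ElementTree` instead of the .NET serializers. In the sample, `dc` is only shorthand for the URI `http://purl.org/dc/elements/1.1/`, declared through `xmlns:dc` on the `oai_dc:dc` element. A parser resolves `dc:language` only while that declaration is in scope, so a fragment cut out without it fails with an unbound-prefix error. Whether that is what caused the crash described above is an assumption. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# URI bound to the "dc" prefix in the sample from the question.
DC = "http://purl.org/dc/elements/1.1/"

# Same element twice: once with the prefix declared, once without.
declared = '<r xmlns:dc="%s"><dc:language>en</dc:language></r>' % DC
undeclared = "<r><dc:language>en</dc:language></r>"


def language_of(xml_text):
    """Look up dc:language by mapping the prefix to its namespace URI."""
    root = ET.fromstring(xml_text)
    node = root.find("dc:language", {"dc": DC})
    return node.text if node is not None else None


def parses(xml_text):
    """True when the text is namespace-well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(language_of(declared))   # value of the dc:language element
    print(parses(undeclared))      # the prefix has no declaration in scope
```

The .NET serializers need the same piece of information, namely the namespace URI for each element, supplied through their attributes.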

 
10.3.2015 18:35

Replying to Tayson
vodslon:

I think that if XmlSerializer fails and this is something you will have to deal with often, you could also write it the blunt way with a reader. I dealt with a very similar problem and in the end I had to do it with string.substring(...): first cut out the text between the pair of tags that mark one "object", then search inside that piece for the next substring and store it, for example the value between <dc:creator> and </dc:creator>. It is laborious and fairly amateurish, but in the end it worked.
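A rough sketch of that substring approach, in Python for illustration. The function names are hypothetical. It assumes the tags appear exactly as written, without attributes. It also does not decode entities such as `&#13;` or `&amp;`, which a real XML parser would handle.

```python
def between(text, open_tag, close_tag, start=0):
    """Return (value, next_position) for the first tag pair at or after start.

    Returns (None, -1) when no complete pair is left.
    """
    i = text.find(open_tag, start)
    if i == -1:
        return None, -1
    i += len(open_tag)
    j = text.find(close_tag, i)
    if j == -1:
        return None, -1
    return text[i:j], j + len(close_tag)


def all_between(text, open_tag, close_tag):
    """Collect every value enclosed by the given pair of tags."""
    values, pos = [], 0
    while True:
        value, pos = between(text, open_tag, close_tag, pos)
        if value is None:
            return values
        values.append(value)


def extract_creators(raw):
    """One list of creators per <record>...</record> block."""
    return [
        all_between(block, "<dc:creator>", "</dc:creator>")
        for block in all_between(raw, "<record>", "</record>")
    ]


if __name__ == "__main__":
    raw = (
        "<record><dc:creator>A. Bondavalli</dc:creator>"
        "<dc:creator>I. Mura</dc:creator></record>"
    )
    print(extract_creators(raw))
```

Cutting out each record first keeps the creators of one record from leaking into the next, which is the two-step search described above.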

 
11.3.2015 8:07

Replying to vodslon
Tayson:

So now I am going to solve it by parsing the string and picking out the values between the tags, and I am trying to keep it as simple as possible. Then I need to save the result to CSV. I also tried downloading parsers from the internet, quite a lot of them, and not one of them accepted this file as XML and converted it to CSV, so I guess I am left with writing it myself.
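One detail worth keeping in mind for the CSV step: the `dc:description` values in the sample contain commas and a line break, so gluing fields together with "," by hand would produce broken rows. A minimal sketch in Python of letting a CSV writer do the quoting (the function name and the example row are hypothetical):

```python
import csv
import io


def rows_to_csv(header, rows):
    """Write rows as CSV text; the writer quotes fields that need it."""
    out = io.StringIO(newline="")
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


def csv_to_rows(text):
    """Read the CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    header = ["identifier", "description"]
    row = [
        "oai:CiteSeerXPSU:10.1.1.1.1484",
        'formulated\nas a dynamic network, with "subnets"',
    ]
    text = rows_to_csv(header, [row])
    print(text)
    print(csv_to_rows(text) == [header, row])
```

Reading the text back with a CSV reader is a quick way to check that commas, quotes and line breaks inside a field survived the round trip.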

 
11.3.2015 9:06