I have just started working with MaryTTS, an open-source, multilingual text-to-speech synthesis system.
I am currently trying to add support for a new language, which involves importing a large XML dump (around 670 MB) into a MySQL database. The problem arises at this step: running the script wkdb_cleaning_up.sh throws com.mysql.jdbc.PacketTooBigException.
I have set max_allowed_packet=1024M, but this has no effect.
Full stack trace:
Exception in thread "main" java.io.IOException: com.mysql.jdbc.PacketTooBigException: Packet for query is too large (1151137 > 1048576). You can change this value on the server by setting the max_allowed_packet' variable.
at org.mediawiki.importer.XmlDumpReader.readDump(XmlDumpReader.java:92)
at marytts.tools.dbselection.DBHandler.loadPagesWithMWDumper(DBHandler.java:251)
at marytts.tools.dbselection.WikipediaMarkupCleaner.processWikipediaPages(WikipediaMarkupCleaner.java:1044)
at marytts.tools.dbselection.WikipediaProcessor.main(WikipediaProcessor.java:365)
Caused by: org.xml.sax.SAXException: com.mysql.jdbc.PacketTooBigException: Packet for query is too large (1151137 > 1048576). You can change this value on the server by setting the max_allowed_packet' variable.
at org.mediawiki.importer.XmlDumpReader.endElement(XmlDumpReader.java:227)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(Unknown Source)
at javax.xml.parsers.SAXParser.parse(Unknown Source)
at org.mediawiki.importer.XmlDumpReader.readDump(XmlDumpReader.java:88)
... 3 more
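One detail in the message above may be relevant: the limit it reports is 1048576 bytes, which is exactly 1 MiB and not the 1024M I configured, so the setting has possibly not reached the server that the importer connects to. The following is only a minimal sketch for reading those two numbers out of the message; the class name PacketLimitCheck and the method parseSizes are made up for illustration and are not part of MaryTTS, mwdumper, or Connector/J.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of MaryTTS or Connector/J): extracts the
// query size and the packet limit from a PacketTooBigException message.
public class PacketLimitCheck {
    // Matches the "(<query size> > <limit>)" part of the message.
    private static final Pattern SIZES = Pattern.compile("\\((\\d+) > (\\d+)\\)");

    /** Returns {querySize, limit} in bytes, or null if no such pair is found. */
    public static long[] parseSizes(String message) {
        Matcher m = SIZES.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        String msg = "Packet for query is too large (1151137 > 1048576).";
        long[] sizes = parseSizes(msg);
        long mib = 1024L * 1024L;
        System.out.println("query bytes: " + sizes[0]);
        System.out.println("limit bytes: " + sizes[1] + " = " + (sizes[1] / mib) + " MiB");
        System.out.println("1024M would be: " + (1024L * mib) + " bytes");
    }
}
```

To see which value the server is actually using, the statement SHOW VARIABLES LIKE 'max_allowed_packet'; can be run in a MySQL client. As far as I understand, the option only takes effect when it is set for the server (for example under the [mysqld] section of the configuration file) and the server is restarted.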
Does anyone here have experience with MaryTTS?
Apologies for my poor English, by the way. Let me know if any details are missing from the question; I will add more if needed.