System development company in Tachikawa, Tokyo [CONPHIC Co., Ltd.]

Data conversion

Corpus

Systems Development Corporation in Tachikawa, Tokyo CONPHIC Co., Ltd. Thank you for visiting our blog. CONPHIC Co., Ltd. processes and formats big data. The Center for Corpus Development at the National Institute for Japanese Language and Linguistics releases several kinds of corpora. For modern Japanese, it has released the “Balanced Corpus of Contemporary Written Japanese”, the “Corpus of Spontaneous Japanese” and the “NINJAL Ultra-Large Web-scale Japanese Corpus”. It has also released a full-text search system, “ひまわり” (Himawari), whose Diet minutes package stores data from the Diet minutes search system. CONPHIC Co., Ltd. has experience in digitizing the Diet minutes. We carried out the work from OCR …

Data Entry and OCR

CONPHIC Co., Ltd. creates digital data from paper-based materials. When we scan paper-based materials and produce text data by OCR, we manually correct machine-processing errors, analyze the error tendencies of the OCR software and write check programs to improve the quality of the text. If the OCR software cannot read a file, we enter the data manually. In this process, two members work on the same file, and we check and improve quality by comparing their work. It is also possible to check the quality of both the OCR output and the manual entry. CONPHIC Co., Ltd. has conducted …
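
The double-entry check described above, in which two members key the same file and the results are compared, can be illustrated with a small sketch. This is not our production check program; the function name `compare_entries` is hypothetical, and the comparison simply uses Python's standard `difflib` module to list the spans where the two entries disagree.

```python
import difflib


def compare_entries(entry_a: str, entry_b: str) -> list[tuple[str, str, str]]:
    """Return (operation, text_a, text_b) for every span where the two entries differ.

    `compare_entries` is a hypothetical helper for illustration only.
    """
    # autojunk=False disables difflib's heuristic that ignores very frequent characters.
    matcher = difflib.SequenceMatcher(None, entry_a, entry_b, autojunk=False)
    differences = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, entry_a[a_start:a_end], entry_b[b_start:b_end]))
    return differences


if __name__ == "__main__":
    # Each reported span is a place a reviewer would need to check against the source page.
    for operation, text_a, text_b in compare_entries("Tachikawa, Tokyo", "Tachikawa, Tokyo."):
        print(operation, repr(text_a), repr(text_b))
```

An empty result means both members produced identical text; any other result points the reviewer to the exact characters that need to be checked against the original page.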

Exceptional Character

CONPHIC Co., Ltd. creates digital data from paper-based materials, converts data into markup languages and builds databases. When digitizing old books, we often encounter exceptional characters. These characters are often used in names of places and people, and they are corrupted or replaced with another character when displayed in a browser. To display text, we must use characters that are covered by a character code such as Shift-JIS or UTF, but exceptional characters are not included in the character code. In this case, we display these characters as images, replace them with other characters or explain them. For …
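
As a rough illustration of how characters outside a character code can be detected, the sketch below tries to encode each character with Python's built-in `shift_jis` codec and collects the ones that fail. The function names and the use of “〓” as the replacement mark are assumptions made for this example, not a description of our actual tools.

```python
def find_unencodable(text: str, encoding: str = "shift_jis") -> list[str]:
    """Return the distinct characters of `text` that `encoding` cannot represent.

    Hypothetical helper for illustration only.
    """
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


def replace_unencodable(text: str, encoding: str = "shift_jis", placeholder: str = "〓") -> str:
    """Replace every character that `encoding` cannot represent with `placeholder`.

    The placeholder "〓" is an assumption; an image reference or a note could be used instead.
    """
    missing = set(find_unencodable(text, encoding))
    return "".join(placeholder if ch in missing else ch for ch in text)


if __name__ == "__main__":
    name = "𠮷野"  # the first character is a variant form that Shift-JIS does not contain
    print(find_unencodable(name))
    print(replace_unencodable(name))
```

A list produced this way can serve as a work list of characters that need an image, a substitute character or an explanatory note.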

Automated Formatting in Multilanguage

CONPHIC Co., Ltd. conducts automated formatting of XML documents. To convert XML documents, whether created manually or by a system, into print data, we define the XML document structure, design the print layouts and develop automated formatting programs that can handle large volumes of documents. Formatting multiple languages with the same layout requires special measures because sentence lengths differ between languages. In addition, it is also possible to format automatically by making a special layout for languages which are written from the right side …

Japanese Dependency Structure

CONPHIC Co., Ltd. provides a document creation support system and a translation support system. Analyzing the structure of Japanese sentences makes it possible to find tendencies in documents and makes integration with translation software easier. However, Japanese structure cannot be analyzed fully automatically. “美しき水車小屋の娘” (the Japanese title of Schubert's song cycle “Die schöne Müllerin”) is often used as an example of Japanese dependency structure. In this case automatic analysis is impossible: a machine cannot determine whether “美しき” modifies “水車小屋” (a beautiful water mill) or …

Text Formatting

CONPHIC Co., Ltd. conducts data creation and data conversion with a focus on XML. When we create data, the text must first be prepared in a consistent format, so we perform replacement and macro processing in a text editor. These operations are also possible with Microsoft Word and Excel, but they require some programming knowledge. CONPHIC Co., Ltd. provides a Text Conversion Tool that formats your texts free of charge. It is possible to delete unnecessary new-line characters and unify the characters like one-byte characters and double-byte characters of the …
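
The two clean-up steps mentioned above, removing unnecessary new-line characters and unifying one-byte and double-byte characters, can be sketched in a few lines. This is only a minimal illustration and not the Text Conversion Tool itself; the function name `tidy_text` and the `joiner` parameter are hypothetical. It relies on Unicode NFKC normalization, which folds full-width letters and digits to their half-width forms and half-width katakana to full-width katakana.

```python
import re
import unicodedata


def tidy_text(text: str, joiner: str = " ") -> str:
    """Unify character widths and remove line breaks inside paragraphs.

    Hypothetical helper for illustration only. Paragraphs are assumed to be
    separated by blank lines. Use joiner="" for Japanese text, where no space
    is wanted between joined lines.
    """
    # NFKC maps compatibility forms (full-width ASCII, half-width katakana) to standard forms.
    normalized = unicodedata.normalize("NFKC", text)
    # Split on blank lines so that paragraph boundaries survive.
    paragraphs = re.split(r"\n\s*\n", normalized)
    joined = [
        joiner.join(line.strip() for line in paragraph.splitlines() if line.strip())
        for paragraph in paragraphs
    ]
    return "\n\n".join(paragraph for paragraph in joined if paragraph)


if __name__ == "__main__":
    print(tidy_text("ＸＭＬ data\nconversion\n\nｶﾀｶﾅ"))
```

Note that NFKC changes more than character width (for example, it also replaces some ligatures and circled numbers), so the result should be reviewed before it is used for data that must stay faithful to the original.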

Comparison Table of Similar Sentences

CONPHIC Co., Ltd. processes and analyzes data. To manage sentence components (topics), we extract similar sentences from all the sentences you have, find the differences in their contents and build a database of sample sentence components. You can create documents efficiently using these components. When making a new document, you check whether a similar document exists in the database; if nothing is found, you register the new document so that a similar one can be created from it next time. When you find a similar document, you check the …
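
The look-up step of this workflow can be sketched as follows. The example assumes the database is simply a list of registered sentences and measures similarity with the character-based ratio from Python's standard `difflib` module; the function name `find_similar` and the threshold of 0.8 are hypothetical choices for illustration, and a real system may use a different similarity measure.

```python
import difflib


def find_similar(new_sentence: str, registered: list[str], threshold: float = 0.8) -> list[tuple[float, str]]:
    """Return (similarity, sentence) pairs at or above `threshold`, most similar first.

    Hypothetical helper for illustration only; the threshold is an assumed value.
    """
    matches = []
    for sentence in registered:
        ratio = difflib.SequenceMatcher(None, new_sentence, sentence, autojunk=False).ratio()
        if ratio >= threshold:
            matches.append((ratio, sentence))
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return matches


if __name__ == "__main__":
    database = [
        "Turn off the power before cleaning the unit.",
        "Store the unit in a dry place.",
    ]
    candidate = "Turn off the power before cleaning the device."
    hits = find_similar(candidate, database)
    if hits:
        print("Similar sentences found:", hits)
    else:
        database.append(candidate)  # nothing similar, so register the new sentence
```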

File Conversion on Server

CONPHIC Co., Ltd. provides a document creation system and converts files into other formats on a server according to your requests. We convert, for example, XML into HTML or PDF and text files into XML or XHTML, and we can also convert files into many other types such as Word, Excel, image and SVG. We propose the most appropriate way to manage, use and update the source data depending on your document type and how the documents are used. It is possible to extract terms in order of appearance frequency from files and make …
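
Extracting terms in order of appearance frequency can be sketched with Python's standard `collections.Counter`. The function name `term_frequencies` is hypothetical, and the example assumes text in which terms are separated by spaces or punctuation; Japanese text would first need to be split into words by a morphological analyzer, which is outside this sketch.

```python
import re
from collections import Counter


def term_frequencies(text: str) -> list[tuple[str, int]]:
    """Return (term, count) pairs, most frequent first.

    Hypothetical helper for illustration only. Terms are taken to be runs of
    word characters, compared case-insensitively.
    """
    terms = re.findall(r"\w+", text.lower())
    return Counter(terms).most_common()


if __name__ == "__main__":
    sample = "XML to HTML, XML to PDF, XML"
    for term, count in term_frequencies(sample):
        print(term, count)
```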

Automated Formatting

CONPHIC Co., Ltd. conducts automated formatting of XML documents. We develop XSL-FO stylesheets that reproduce the same content as your existing printed documents, saving the cost and time you have spent on DTP work. Automated formatting requires good taste in document layout and craftsmanship. Because we have extensive experience in automated formatting, including using SVG where necessary and choosing fonts, we can format many types of documents automatically. I think more and more documents are created with automated formatting technology, e.g. public documents of the government, …

Japanese Characteristic Writing Style

CONPHIC Co., Ltd. creates digital data and XML data from books. Japanese text in books is usually written vertically, but when digitized it is usually handled and displayed as horizontal writing. Reproducing a paper book as it is therefore requires some ingenuity. CONPHIC Co., Ltd. develops style sheets for digital (XML) text data and formats the data so that it can be displayed as vertical writing in a browser, printed, or made into digital books such as ePub. CONPHIC Co., Ltd. has a …

©2014-2024 CONPHIC Co., Ltd.