bookie32 Posted March 28 Hi guys! I have a customer who has created lots of documents containing words and their meanings. Does anyone know how he can convert these documents into a dictionary? bookie32
Tripredacus Posted March 28 What kind of documents?
bookie32 Posted March 29 (Author) Hi Tripredacus. It is a long story... This customer is a writer and has written several books here in Sweden. In the nineties he started creating Word documents with words that were meant to become a dictionary. He has been doing it ever since and has God knows how many documents written in Word that he wants to convert to a dictionary. I know that you can actually create your own dictionary in Word, but he didn't go about it that way. So now he has all these Word documents he wants to convert, and then create a PDF of everything. bookie32
Tripredacus Posted March 29 I'm sure 90s DOC is completely different from the modern DOC/DOCX formats. If he has used any sort of consistent format over the years, it is possible to write a script or program to parse the files and put the information into a single file, or into a database from which a single file can then be generated. I doubt there are any ready-made solutions for what you are looking for.
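As a rough illustration of the "database, then a single file" route mentioned above, here is a minimal Python sketch using the standard `sqlite3` module. It assumes the term/definition pairs have already been extracted from the documents; the table name `entries` and the helpers `build_dictionary` and `export_text` are made up for this example and do not come from any existing tool.

```python
import sqlite3


def build_dictionary(pairs):
    """Load (term, definition) pairs into an in-memory SQLite table.

    Hypothetical helper: the table layout is an assumption for this sketch.
    A term that appears more than once keeps its last definition.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (term TEXT PRIMARY KEY, definition TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO entries (term, definition) VALUES (?, ?)",
        pairs,
    )
    conn.commit()
    return conn


def export_text(conn):
    """Return all entries as one text block, one 'term: definition' per line."""
    rows = conn.execute(
        "SELECT term, definition FROM entries ORDER BY term COLLATE NOCASE"
    )
    return "\n".join(f"{term}: {definition}" for term, definition in rows)


# Pairs as they might arrive from several documents, including a duplicate term.
pairs = [
    ("zebra", "striped animal"),
    ("apple", "a fruit"),
    ("apple", "a round fruit"),
]
conn = build_dictionary(pairs)
print(export_text(conn))
```

Note that SQLite's built-in `NOCASE` collation only handles ASCII, so proper Swedish alphabetical order (with å, ä, ö at the end) would need a custom collation or sorting outside the database.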
bookie32 Posted March 31 (Author) Hi again! I do thank you for your time... Writing programs is a bit over my head, but I am grateful! bookie32
jaclaz Posted April 1 The file format (i.e. which of the various .doc or .docx formats they are) is largely irrelevant, as there are many converters to plainer formats such as .txt or .csv (I would guess Unicode is needed, as Swedish has a lot of "strange" characters). Given the intended use, losing the formatting of the text might not be a problem (or it might be, as bold and italic are widely used in dictionaries). The real issue is that if these .doc files are more "freestyle notes" than anything else, it will be tough to write a program or script capable of separating the fields properly. Essentially a dictionary is structured as a two-field database: term/definition, or key/value. If there is a meaningful, possibly unique, delimiter between the two, importing or converting the files will be easy to script, though there will still be errors, edge cases and whatnot. After that, a dedicated dictionary/lexicography tool might be needed for editing, assembling and formatting (example): https://tshwanedje.com/tshwanelex/ https://tshwanedje.com/tshwanelex/overview.html jaclaz
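To make the delimiter idea concrete, here is a small Python sketch, not a finished converter. It assumes the documents have already been converted to plain text with one entry per line, and that the term and the definition are separated by " - "; the real delimiter (if there is one) would have to be checked against the actual files. The function name `parse_entries` is invented for this example.

```python
def parse_entries(lines, delimiter=" - "):
    """Split each line into (term, definition) at the first delimiter.

    The delimiter is an assumption for this sketch. Lines that do not fit
    the pattern are collected with their line number for manual review
    instead of being silently dropped.
    """
    entries = []
    rejects = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        term, sep, definition = line.partition(delimiter)
        if not sep or not term.strip() or not definition.strip():
            rejects.append((number, line))
            continue
        entries.append((term.strip(), definition.strip()))
    return entries, rejects


# Made-up sample lines, including one "freestyle note" without a delimiter.
sample = [
    "älg - large deer found in Swedish forests",
    "",
    "smörgås - open sandwich",
    "a loose note with no delimiter",
]
entries, rejects = parse_entries(sample)
print(entries)
print(rejects)
```

The rejected lines are the "errors and edge cases" mentioned above: they would need to be fixed by hand or with a dedicated editing tool.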