
Converting UTF-8 HTML to ASCII?


bizzybody

I want to take UTF-8 encoded HTML files in which all the punctuation is written as numeric character strings like &# 040; which is an open parenthesis (space inserted to prevent this forum software from converting it to ASCII*), and convert all those strings to their ASCII equivalents.
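For anyone who does end up scripting this, here is a minimal sketch of the first step: turning numeric character strings of the &#NNN; or &#xHH; form back into real characters. The function name `decode_numeric_refs` is made up for illustration, and the choice to leave `<`, `>` and `&` escaped (so the HTML markup is not broken) is an assumption about what is wanted here.

```python
import re

# Matches decimal (&#040;) and hexadecimal (&#x28;) numeric character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_refs(text):
    """Replace numeric character references with the characters they name."""

    def replace(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if code > 0x10FFFF:
            # Not a valid Unicode code point: leave the reference untouched.
            return match.group(0)
        char = chr(code)
        if char in "<>&":
            # Assumption: keep markup-significant characters escaped.
            return match.group(0)
        return char

    return _NUMERIC_REF.sub(replace, text)


if __name__ == "__main__":
    print(decode_numeric_refs("Hello &#040;world&#041;"))
```

This only decodes the references; characters outside ASCII (curly quotes, dashes, accented letters) still need a separate replacement step afterwards.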

Why? I'm converting them to TealDoc format for my PDA, and no conversion program I've found understands HTML Unicode character strings. They all get replaced with question marks, or removed completely, leaving whatever was on either side of them jammed right against each other.

I've found plenty of Perl and Python scripts, Linux programs, Mac programs, pseudocode fragments and examples for Java, JavaScript, C, and just about every programming language on Earth.

I am not a programmer.

I just want a simple little WINDOWS program that I can feed a file to and get a file output with the Unicode strings converted to their nearest ASCII equivalents.

For example, convert any character with diacritical marks above or below into the plain ASCII character of the same case. En-dashes and em-dashes into hyphens. Ellipsis into ... Left and right quotes into plain straight quotes.
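The rules above could be sketched roughly like this in Python. This is only an illustration under some assumptions: the `PUNCTUATION` table and the `to_ascii` name are made up for this example, the table covers just the punctuation mentioned here, and any character with no obvious ASCII equivalent falls back to a question mark.

```python
import unicodedata

# Hypothetical lookup table covering only the punctuation named in the post.
PUNCTUATION = {
    "\u2013": "-",    # en-dash
    "\u2014": "-",    # em-dash
    "\u2026": "...",  # ellipsis
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
}


def to_ascii(text):
    """Map text to its nearest ASCII form using the rules sketched above."""
    pieces = []
    for char in text:
        if char in PUNCTUATION:
            pieces.append(PUNCTUATION[char])
            continue
        # Split an accented letter into base letter + combining marks,
        # then drop the marks so the plain letter keeps its case.
        decomposed = unicodedata.normalize("NFKD", char)
        stripped = "".join(
            c for c in decomposed if not unicodedata.combining(c)
        )
        # Anything still outside ASCII becomes "?" (an assumption).
        pieces.append(stripped.encode("ascii", "replace").decode("ascii"))
    return "".join(pieces)


if __name__ == "__main__":
    print(to_ascii("caf\u00e9 \u201cna\u00efve\u201d \u2013 wait\u2026"))
```

To use it on a file, the text would be read with the UTF-8 encoding, passed through `to_ascii`, and written back out as ASCII. Letters that have no decomposition (such as ø or ß) would need their own entries in the table.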

I have Office XP. Is there some addon or plugin or something for Word that will make it convert the encoding of an HTML file from UTF-8 into plain old ordinary ASCII, doing the character conversions the way I want instead of replacing them all with question marks or HTML tags like <em>?

*I tried enclosing some UTF-8 strings in [code] UBB tags and this forum software still converted them to normal characters. I thought the code tags were supposed to make forum software leave whatever's between them alone? I also discovered that entering a UTF-8 string into Google will search for that character, not that specific string of punctuation and numbers.