Emsa HTML Tag Remover - HTML Removal utility for Windows
Freeware for Windows
Operating Systems supported: Windows XP, Windows Vista, Windows 7, Windows 8, Windows 8.1, Windows 10
Current version: 1.0.20 - (zip file, 214 KB)
This program is FREEWARE. Please refer to the license.txt file for more information.
(NT compatibility notice: The program runs normally under Windows NT4, but the GUI may not appear correctly.)
Emsa HTML Tag Remover is a utility that removes HTML tags from an HTML file, with control over which kinds of content are removed and how whitespace is handled. Several options remove different types of data from the page, and whitespace removal can condense the resulting text as much as needed. The program works both interactively and in command line mode, which is useful for calling it from other programs or batch files.
When the program is run for the first time, all options are checked by default. This is the most aggressive removal setting, and the result is a single line of text with no carriage returns. We recommend taking the time to experiment with each setting to see its effect, until the output looks the way you want it.
To remove HTML, select an input HTML file, select or type an output file (usually a text file), and then click the 'remove html' button. See also the command line mode described below.
Some of these options are pretty self-explanatory. Here are the ones that may need a bit of explanation:
- Remove all tags strips all remaining HTML data enclosed between < and > characters.
- Generate foreign & special characters converts foreign and special HTML sequences into the characters they represent. For example, the '&pound;' sequence is rendered as the '£' pound character, the '&gt;' sequence is rendered as the '>' character, and so on.
- Remove spaces trims all unnecessary spaces from the output, so there is never more than one space between two words.
- Remove blank lines cuts out all unneeded carriage returns and line feeds, so that all lines containing text follow one another with no blank lines in between.
- Finally, remove all lines cuts out all carriage returns and line feeds, so the resulting data is one line of text. This may be useful when only the words are needed, for example for data to be included in a search database.
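The effect of these options can be approximated with a short Python sketch. This is not the program's actual implementation: the function name `strip_html` and its keyword arguments are hypothetical, chosen only to mirror the options described above, and the regular-expression tag removal is a simplifying assumption that does not handle every HTML edge case.

```python
import html
import re


def strip_html(text, remove_tags=True, decode_entities=True,
               remove_spaces=True, remove_blank_lines=True,
               remove_all_lines=False):
    """Hypothetical approximation of the options described above."""
    if remove_tags:
        # Drop everything between '<' and '>' (naive; not a real HTML parser).
        text = re.sub(r"<[^>]*>", "", text)
    if decode_entities:
        # Turn sequences such as &pound; or &gt; into their characters.
        text = html.unescape(text)
    # Normalise line endings so the line-based steps behave consistently.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    if remove_spaces:
        # Collapse runs of spaces/tabs to one space and trim each line.
        lines = [re.sub(r"[ \t]+", " ", line).strip() for line in lines]
    if remove_blank_lines or remove_all_lines:
        # Keep only lines that still contain some text.
        lines = [line for line in lines if line.strip()]
    # One line of output when remove_all_lines is set, else one per line.
    return (" " if remove_all_lines else "\n").join(lines)


sample = "<p>Price:   &pound;5 &gt; &pound;4</p>\n\n<b>Done</b>\n"
print(strip_html(sample))
print(strip_html(sample, remove_all_lines=True))
```

In this sketch, entities are decoded only after tags are stripped, so a decoded '>' character is not mistaken for the end of a tag.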
The program is pretty simple so the above instructions should be enough.
Command line mode:
When extra parameters are passed to the program on the command line, the program does not show the GUI. Instead, it silently processes the data, generates the output file and exits. This is especially useful for batch file processing.
Command line Syntax: htmlrem.exe inputfile outputfile
For best results, please specify the inputfile and outputfile with full paths. If full paths are not specified, the program will look for the input and generate the output in the same directory. The input and output filename parameters must not contain spaces.
To configure the program options for command line mode, first run the program once normally, set your options and close it. The program saves its configuration in the system registry, and when called afterwards from the command line, it uses the same settings.
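For batch processing, the command line syntax above could be driven from a small script. The following Python sketch is only an illustration: the install location in `HTMLREM` and the helper names `build_command` and `convert_folder` are assumptions, not part of the program. The program's exit codes are not documented here, so the sketch does not inspect them.

```python
import subprocess
from pathlib import Path

# Assumption: location of the unzipped executable; adjust for your machine.
HTMLREM = r"C:\Tools\htmlrem\htmlrem.exe"


def build_command(input_file, output_file, exe=HTMLREM):
    """Return the argument list for: htmlrem.exe inputfile outputfile."""
    for name in (str(input_file), str(output_file)):
        # File name parameters must not contain spaces (see above).
        if " " in name:
            raise ValueError(f"path must not contain spaces: {name!r}")
    return [str(exe), str(input_file), str(output_file)]


def convert_folder(folder, exe=HTMLREM):
    """Run the tool once per .html file, writing a .txt file beside it."""
    # resolve() yields full paths, as recommended for best results.
    for src in sorted(Path(folder).resolve().glob("*.html")):
        dst = src.with_suffix(".txt")
        # With parameters, the program runs silently and then exits.
        subprocess.run(build_command(src, dst, exe))
```

Calling `convert_folder` on a folder of saved pages would process each file using whatever options were last saved from the interactive mode.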
No installation is required. Simply unzip the files and run the main executable. Please read the included text files before running the executable.
If you have suggestions, bug reports or requests for new features, please email us. Thanks!
Any ordinary Windows machine with 1 MB of free disk space will do. Minimum screen resolution: 800x600.
Support and feedback
Please refer to the license.txt file for information about support and feedback.
Please read the license.txt for licensing information.