eded.js editor
About eded
Originally authored by Ken Thompson, the father of UNIX, ed is a line editor that has been installed on almost all UNIX-based operating systems since the release of UNIX V1 in 1971. If you use Linux or BSD (including macOS) you likely have some version of ed on your computer. Ed's influence is most notably seen in the command line utility grep, whose name comes from ed's global command (global/regular expression/print), but remnants of ed can be seen in its successors ex, vi, vim, and more recently neovim. Notable differences between eded and ed V1 include the lack of the ! command to access the shell, the addition of full JavaScript regular expressions, and the addition of verbose error messages that are displayed under the input. Reading (r and e) and writing (w) are implemented via local storage in the browser. If local storage is not enabled eded will not read or write files. The source of eded is hosted at GitHub. Notes on design are included at the end of this about.
The first version of ed differs notably from its more modern successors: it lacks marks and moves, and input commands (a, i, and c) cannot be used with the global (g) command. Substitute (s) does not take an appended global (g) flag and stops at the first match found. Additionally, there were no default open ranges (lone , or ;), although addresses can be omitted entirely and the defaults ('.' for most commands) will still apply.
The commands and addresses tabs are copies of the relevant sections of the UNIX V1 ed manual, except where noted. Regular expressions are implemented via JavaScript's RegExp object, and as such any valid search pattern in JavaScript will work in eded. If you are not familiar with regular expressions please refer to:
- Mozilla Developer Network
- Regular-Expression.info
- regex101.com
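Because patterns are handed to JavaScript's RegExp object, anything RegExp accepts (lookahead, \d, \b and so on) can be used in a search. The following is a minimal sketch of a g/re/p style search over a buffer; matchingLines is a hypothetical helper written for illustration and is not taken from eded's source.

```javascript
// Hypothetical helper (not eded's code): return the 1-based numbers of the
// lines in `lines` that contain a match for `pattern`.
function matchingLines(lines, pattern) {
  // No 'g' flag, so RegExp.prototype.test keeps no state between lines.
  const re = new RegExp(pattern);
  return lines.flatMap((text, i) => (re.test(text) ? [i + 1] : []));
}
```

For example, matchingLines(buffer, '^e(?=d)') would select lines starting with "e" only when the next character is "d", something the original ed pattern syntax could not express.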
Notable Ed and UNIX Resources:
- PDF Tutorial - A short tutorial by Brian Kernighan which concisely covers the operation of (modern) ed.
- Ed V1 manual - Scanned man pages from UNIX version 1 from Dennis M. Ritchie.
- UNIX V0 - Very early "man" pages and notes on the operation and usage of UNIX (version 0). A brief description of ed is included. There are tons of great resources and documentation on cat-v.org in general.
- OpenBSD ed - OpenBSD's modern ed manual.
- GNU ed - GNU's modern ed manual.
- AIX ed - IBM's AIX ed. This is probably the best documentation of a modern ed variant.
- ed V6 source - Source from UNIX V6 of ed ported to compile on modern systems. V6 was the first version of ed written in C.
- ed V10 source - Brian Kernighan's cs333 assignment to build grep from ed, with V10 source included.
- Software Tools in Pascal - Most modern eds reference the implementation described in this book by Plauger and Kernighan.
- UNIX source - UNIX v1/v2 source ready to be run on an emulator.
- pdp-11 - PDP-11 emulator in JavaScript.
- pdp-11 - Another PDP-11 emulator in JavaScript.
Notes on design
Eded should very much be considered a work in progress. Its development has primarily been an educational activity in JavaScript. The current version is the most functional (as in "working", not the programming paradigm), but the code is in various states of refactoring and needs a thorough power washing. Below is a brief list of approaches I considered:
- Huge switch statement and lots of static methods and lots of flags - This was my initial approach and, after having the chance to look at actual ed.c source, appears to be the preferred method. Not very JavaScript and not very useful for my education.
- Regex - Ed's input can (mostly) be parsed/executed with modern regex. This seemed appropriate, as Ken Thompson is largely responsible for the development and subsequent popularity of regular expressions in software development. Essentially the addressing, command, and optional print sections of ed's input can be pulled apart into capture groups using lots of lookbehind/lookahead. It ends up looking very much like an entry in an obfuscated Perl competition and is not maintainable or extensible. It was, however, quite a bit of fun. There are bits of this approach left in the current version.
- Finite State Machines - The current version of eded essentially uses two FSMs to parse input. The first is for addressing and the second for commands. It is a very boilerplatey approach, although I suspect I am missing something simple that might make life a little easier. The "command" FSM does next to nothing, as most commands just terminate after a single function call, but the hope is that this approach is much more flexible if eded were to be expanded (marks, moves, and multiline commands should be trivial to implement with the current architecture). The global command is implemented as a recursive call on each line that matched. The matched line array is then reconciled in the case of additions or deletions. Input (append, insert, change) has been pulled out of the FSM, as I initially thought this would be required for interacting with the user. That is not the case. Eded passes a copy of state through the FSM, throws on error, and simply reverts to the original state as necessary. This is somewhat in line with the original input parsing of ed. The entire line would be parsed before it was executed (source needed).
- Miscellaneous - Passing around functions instead of passing around state was considered. The use of classes vs regular objects vs modules is somewhat haphazard in the current version. I am still unsure as to the final route that should be taken. The code obviously needs more modularity and consistency, hopefully in the near future. Eded is MIT licensed.
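The copy-and-revert behaviour mentioned in the finite state machine notes (run a command against a copy of the state, throw on error, keep the original state if anything fails) can be sketched roughly as follows. This is an illustration under assumptions, not eded's actual code: the state shape { lines, current } and the runCommand function are hypothetical.

```javascript
// Hypothetical state shape (an assumption): { lines: string[], current: number }.
// `command` is any function that mutates the draft state and throws on error.
function runCommand(state, command) {
  // Work on a copy so a failing command cannot leave a half-edited buffer.
  const draft = { ...state, lines: state.lines.slice() };
  try {
    command(draft);
    return { state: draft, error: null };
  } catch (err) {
    // Revert: hand back the untouched original together with the message.
    return { state, error: err.message };
  }
}
```

The caller always replaces its state with the returned one, so a successful command commits its changes and a failed one leaves the buffer exactly as it was.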
To understand addressing in ed it is necessary to know that at any time there is a current line. Generally speaking, the current line is the last line affected by a command; however, the exact effect on the current line by each command is discussed under the description of the command.
- The character "." addresses the current line.
- The character "$" addresses the last line of the buffer.
- A decimal number n addresses the nth line of the buffer.
- A regular expression enclosed in slashes "/" addresses the first line found by searching toward the end of the buffer and stopping at the first line containing a string matching the regular expression. If necessary the search wraps around to the beginning of the buffer.
- A regular expression enclosed in queries "?" addresses the first line found by searching toward the beginning of the buffer and stopping at the first line found containing a string matching the regular expression. If necessary the search wraps around to the end of the buffer.
- An address followed by a plus sign "+" or a minus sign "-" followed by a decimal number specifies that address plus (resp. minus) the indicated number of lines. The plus sign may be omitted.
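The address forms listed above can be sketched as a small resolver. This is a minimal illustration under assumptions, not eded's parser: resolveAddress and search are hypothetical names, the buffer is assumed to be an array of strings, line numbers are 1-based, and patterns are assumed not to need escaped delimiters.

```javascript
// Hypothetical helper: find the first matching line, starting one line away
// from `current` in direction `step` (+1 or -1) and wrapping around.
function search(re, lines, current, step) {
  const n = lines.length;
  for (let i = 1; i <= n; i++) {
    const line = ((current - 1 + step * i) % n + n) % n + 1;
    if (re.test(lines[line - 1])) return line;
  }
  throw new Error('no match');
}

// Hypothetical helper: resolve one address string to a 1-based line number.
function resolveAddress(addr, lines, current) {
  // base, then an optional unsigned offset, then any number of signed offsets
  const m = /^(\.|\$|\d+|\/.*\/|\?.*\?)(\d+)?((?:[+-]\d+)*)$/.exec(addr.trim());
  if (!m) throw new Error(`bad address: ${addr}`);
  const [, base, bare = '0', signed] = m;
  let line;
  if (base === '.') line = current;                 // current line
  else if (base === '$') line = lines.length;       // last line
  else if (/^\d+$/.test(base)) line = Number(base); // nth line
  else {
    const step = base[0] === '/' ? 1 : -1;          // "/" forward, "?" backward
    line = search(new RegExp(base.slice(1, -1)), lines, current, step);
  }
  line += Number(bare);                             // omitted plus sign
  for (const off of signed.match(/[+-]\d+/g) ?? []) line += Number(off);
  if (line < 0 || line > lines.length) throw new Error(`out of range: ${addr}`);
  return line;
}
```

Line 0 is allowed through the range check because some commands (a and r) accept address "0"; a fuller implementation would leave that decision to each command.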
Commands may require zero, one, or two addresses. Commands which require no addresses regard the presence of an address as an error. Commands which require the presence of one address all assume a default address (often ".") but if given more than one address ignore any extras and use the last given. Commands which require two addresses have defaults in the case of zero or one address but use the last two if more than two are given.
Addresses are separated from each other typically by a comma (,). They may also be separated by a semicolon ";". In this case the current line is set to the previous address before the next address is interpreted. This feature is used to control the starting line for forward and backward searches ("/", "?").
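The difference between the two separators is easiest to see when the second address is a search. The sketch below is illustrative only: resolveOne and resolvePair are hypothetical names, resolveOne deliberately handles just numbers, ".", "$" and forward "/re/" searches, and patterns are assumed not to contain "," or ";".

```javascript
// Hypothetical minimal resolver: n, '.', '$' and forward /re/ search only.
function resolveOne(addr, lines, current) {
  if (addr === '.') return current;
  if (addr === '$') return lines.length;
  if (/^\d+$/.test(addr)) return Number(addr);
  const re = new RegExp(addr.slice(1, -1));
  for (let i = 1; i <= lines.length; i++) {
    const line = (current - 1 + i) % lines.length + 1; // wraps past the end
    if (re.test(lines[line - 1])) return line;
  }
  throw new Error('no match');
}

// Resolve "a,b" or "a;b". With ';' the current line becomes the first
// address before the second address is interpreted.
function resolvePair(spec, lines, current) {
  const m = /^(.+?)([,;])(.+)$/.exec(spec);
  if (!m) throw new Error(`expected two addresses: ${spec}`);
  const [, a, sep, b] = m;
  const first = resolveOne(a, lines, current);
  const second = resolveOne(b, lines, sep === ';' ? first : current);
  if (first > second) throw new Error('first address lies after second');
  return [first, second];
}
```

With the buffer x, foo, y, foo, z and the current line at 3, "1,/foo/" searches forward from line 3 and yields lines 1 through 4, while "1;/foo/" searches forward from line 1 and yields lines 1 through 2.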
In the following list of ed commands, the default addresses are shown in parentheses. The parentheses are not part of the address, but are used to show that the given addresses are the default.
As mentioned, it is generally illegal for more than one command to appear on a line. However, any command may be suffixed by "p" (for "print"). In that case, the current line is printed after the command is complete.
In any two-address command, it is illegal for the first address to lie after the second address.
- (.) a The append command reads the given text and appends it after the addressed line. "." is left on the last line input, if there were any, otherwise at the addressed line. Address "0" is legal for this command; text is placed at the beginning of the buffer.
- (.,.) c The change command deletes the addressed lines, then accepts input text which replaces these lines. "." is left at the last line input; if there were none, it is left at the first line not changed.
- (.,.) d The delete command deletes the addressed lines from the buffer. "." is left at the first line not deleted.
- e filename The edit command causes the entire contents of the buffer to be deleted, and then the named file to be read in. "." is set to the last line of the buffer. The number of characters read is typed. Note: Eded uses local storage to read and write files. If you have local storage disabled this command will error.
- (1,$) g/regular expression/command In the global command, the first step is to mark every line which matches the given regular expression. Then for every such line, the given command is executed with "." set to that line. The repeated command cannot be a, g, i, or c. Note: The default addresses are originally listed as (1,s). I am assuming this was in error and have corrected it to "$".
- (.) i This command inserts given text before the addressed line. "." is left at the last line input; if there were none, at the addressed line. This command differs from the a command only in the placement of the text.
- (.,.) l The list command prints the addressed lines in an unambiguous way. Note: On modern computers there is little need to worry about the ambiguity of special characters. By way of JavaScript, eded does not accept any control characters (except "Enter"), but will read and write Unicode just fine. In the spirit of ed, "$" is left in place of newlines.
- (.,.) p The print command prints the addressed lines. "." is left at the last line printed.
- q The quit command causes ed to exit. No automatic write of a file is done. Note: There is no "quit" for eded; you can simply exit your browser if you would like to quit. The 'q' command will simply delete the buffer and start eded fresh from nothing.
- ($) r filename The read command reads in the given file after the addressed line. If no file name is given, the file last mentioned in e, r, or w commands is read. Address "0" is legal for r and causes the file to be read at the beginning of the buffer. If the read is successful, the number of characters read is typed. "." is left at the last line of the file. Note: Eded uses local storage to read and write files. If you have local storage disabled this command will error.
- (.,.) s/regular expression/replacement/ The substitute command searches each addressed line for an occurrence of the specified regular expression. On each line in which a match is found, the first (and only first, compare QED) matched string is replaced by the replacement specified. It is an error for the substitution to fail on all addressed lines. Any character other than space or newline may be used instead of "/" to delimit the regular expression and the replacement. "." is left at the last line substituted. The ampersand "&" appearing in the replacement is replaced by the regular expression that was matched. The special meaning of "&" in this context may be suppressed by preceding it by "\".
- (1,$) w filename The write command writes the addressed lines onto the given file. If no file name is given, the file last named in e, r, or w commands is written. "." is unchanged. If the command is successful, the number of characters written is typed. Note: Eded uses local storage to read and write files. If you have local storage disabled this command will error.
- ($) = The line number of the addressed line is typed. "." is unchanged by this command.
- ! UNIX command Note: This command is unavailable in eded. Sorry.
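As an illustration of the substitute (s) rules described in the list above (first match per line only, "&" standing for the matched text, "\&" for a literal ampersand, and an error when no addressed line matches), here is a minimal sketch. It is not eded's implementation: substitute and expandReplacement are hypothetical names, line numbers are assumed to be 1-based, and no escapes other than "\&" are handled.

```javascript
// Hypothetical helper: expand an ed replacement string. '&' becomes the
// matched text and '\&' becomes a literal ampersand.
function expandReplacement(replacement, matched) {
  return replacement.replace(/\\&|&/g, (tok) => (tok === '&' ? matched : '&'));
}

// Hypothetical helper: apply s/pattern/replacement/ to lines from..to
// (1-based, inclusive). Returns the new buffer and the new current line.
function substitute(lines, from, to, pattern, replacement) {
  const re = new RegExp(pattern); // no 'g' flag: first match per line only
  const out = lines.slice();
  let lastChanged = 0;
  for (let n = from; n <= to; n++) {
    if (!re.test(out[n - 1])) continue;
    // A function replacer keeps JavaScript's own '$1', '$&' patterns literal.
    out[n - 1] = out[n - 1].replace(re, (matched) =>
      expandReplacement(replacement, matched));
    lastChanged = n; // "." is left at the last line substituted
  }
  if (lastChanged === 0) {
    throw new Error('substitution failed on all addressed lines');
  }
  return { lines: out, current: lastChanged };
}
```

Using a replacer function rather than a replacement string is a deliberate choice here: JavaScript spells "the matched text" as "$&" rather than "&", so passing an ed replacement straight to String.prototype.replace would give the wrong result.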