[Generic] Tools for managing fic publishing

Started by sarsaparilla, March 02, 2014, 10:26:42 AM


sarsaparilla

(To the moderators: please move this post if it is in the wrong place.)

After the events in January I have been frighteningly energetic, as if my body had a way of turning grief into motivation. I anticipate returning to fan fiction writing fairly soon, but here's something that came out of the preparatory work -- a small Perl script that turns text files in a Markdown-like format into clean HTML, using only the rather restrictive tag set that FF.net allows.

The rationale for this came when I looked through my old works -- manually reformatted from text to HTML -- and noticed that there were as many typographical conventions as there were stories. And, the manual work was tedious and error-prone, in any case. I know that Brian used Markdown with some custom filter, but I never asked him about the exact details. So, in order to address this particular issue I spent one evening defining my own markup style and making a script (half an hour learning Perl, three hours poking the code to see why it didn't do what it was supposed to do) that turns it into FF.net compatible HTML.

The reason for a specific format was to have a human-readable source text in the vein of Markdown, but one that provides the feature set of FF.net and nothing else, to avoid wasted effort. Most of the format conventions are self-evident and similar to Markdown, with the exception that a single line break produces a hard line break, and two or more line breaks in succession produce a paragraph break. Also, leading whitespace on a line centers the respective header or paragraph.
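As a rough illustration of the two rules that differ from Markdown (this is not the attached Perl script, just a minimal Python sketch of the same idea): the function name `convert` is made up for the example, and the exact centering markup that FF.net accepts is an assumption here.

```python
import html
import re


def convert(source: str) -> str:
    """Convert the plain-text markup to HTML paragraphs (illustrative only)."""
    text = source.replace("\r\n", "\n").strip("\n")
    # Two or more line breaks in succession (blank lines) separate paragraphs.
    blocks = re.split(r"\n(?:[ \t]*\n)+", text)
    paragraphs = []
    for block in blocks:
        if not block.strip():
            continue
        # Leading whitespace on the first line marks the block as centered.
        centered = block[0] in " \t"
        # A single line break inside a block becomes a hard line break.
        lines = [html.escape(line.strip(), quote=False) for line in block.split("\n")]
        body = "<br>\n".join(lines)
        # Assumption: centering is expressed with an inline style on <p>.
        open_tag = '<p style="text-align:center;">' if centered else "<p>"
        paragraphs.append(f"{open_tag}{body}</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    print(convert("  Chapter One\n\nFirst line\nsecond line\n\n\nNext paragraph"))
```

Headers, emphasis and the other Markdown-like conventions are left out on purpose; only the line break and centering rules described above are covered.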

The script, a sample source text that demonstrates the format, and the resulting HTML file are attached. If you, like me, are distrustful of Office, mostly publish on FF.net, and want to streamline the process, then please have a look at them to see whether they suit your needs. Since the script is written in Perl, it is easy to adjust the implemented typography according to personal preferences.

Muphrid

Interesting, your perl style is more "we need just a bunch of regexes, so that's what we're doing."  I learned about perl through work (we use a lot of perl scripts to smartly define a set of parameters for these general relativity simulations), so my perl style is very C-like.

This seems like a good time to talk about making ham sandwiches--er, I mean, the dirty work of writing, managing writing files in general, and so on.

I've mentioned before I write in LaTeX, in part because I wanted to learn some for work ultimately, so writing in it would help, and in part because at the time, there was a particularly good LaTeX to RTF converter that worked well with FFN.  That converter doesn't really work anymore, so I developed a script to convert LaTeX in the style I write with to common formats. If you use characters with diacritics, LaTeX is pretty easy, because all of these can be input with character sequences from a common English keyboard.  Of course, having had to develop my own script for this, I've had to manually input all these sequences to the script, so it's kinda self-defeating except when using LaTeX for making PDFs.  Naturally, I've not found any real use for these PDFs, pretty though they may be.
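For anyone curious what handling those accent sequences can look like, here is a minimal Python sketch. It is not how strip-tex does it; it is one possible way to avoid listing every accented letter by hand, by mapping each LaTeX accent to a Unicode combining mark and letting normalization compose the final character. The function name `latex_to_unicode` is hypothetical, and only five common accents applied to plain ASCII letters are covered.

```python
import re
import unicodedata

# LaTeX accent character -> Unicode combining mark.
COMBINING = {
    "'": "\u0301",  # acute:      \'e
    "`": "\u0300",  # grave:      \`a
    '"': "\u0308",  # diaeresis:  \"o
    "^": "\u0302",  # circumflex: \^o
    "~": "\u0303",  # tilde:      \~n
}

# Matches both \'e and \'{e}.
ACCENT = re.compile(r"""\\(['`"^~])(?:\{([A-Za-z])\}|([A-Za-z]))""")


def latex_to_unicode(text: str) -> str:
    """Replace simple LaTeX accent sequences with precomposed characters."""
    def compose(match: re.Match) -> str:
        letter = match.group(2) or match.group(3)
        # NFC normalization merges the letter and the combining mark.
        return unicodedata.normalize("NFC", letter + COMBINING[match.group(1)])

    return ACCENT.sub(compose, text)


if __name__ == "__main__":
    print(latex_to_unicode(r"caf\'e, na\"ive, ma\~nana, d\'{e}j\`a"))
```

Sequences such as `\c{c}` or the dotless `\i` would still need their own entries, so a hand-written table never goes away entirely.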

I also developed a script for processing HTML back to plain text or bbcode.  It was mainly for giving c&c for Brian, and I quickly realized that Arakawa's HTML style clashed with it pretty hard, but it's still somewhat useful.  I quickly started using a kinda eclectic system where I would paste in paragraphs to separate files, leaving them in <p> tags, while putting commentary outside of <p> tags.  Then, I would process the commentary file, automatically changing <p> tags to quotes (as well as other markup to bbcode style tags).
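The commentary workflow described above can be sketched roughly like this in Python. It is an illustration under assumptions, not the strip-html script itself: the function name `commentary_to_bbcode` and the choice of inline tags are invented for the example, and regex-based HTML handling like this only copes with simple, well-behaved markup.

```python
import re

# HTML inline tag -> bbcode tag (an assumed, minimal mapping).
INLINE = {"i": "i", "em": "i", "b": "b", "strong": "b", "u": "u"}


def commentary_to_bbcode(text: str) -> str:
    """Turn pasted <p> paragraphs into quotes; leave commentary as-is."""
    # Story paragraphs were left inside <p> tags, so they become quotes.
    text = re.sub(
        r"<p\b[^>]*>(.*?)</p>",
        r"[quote]\1[/quote]",
        text,
        flags=re.DOTALL | re.IGNORECASE,
    )

    # Remaining inline markup is swapped for the bbcode equivalent.
    def swap(match: re.Match) -> str:
        slash, tag = match.group(1), match.group(2).lower()
        return f"[{slash}{INLINE[tag]}]"

    return re.sub(r"<(/?)(i|em|b|strong|u)\b[^>]*>", swap, text, flags=re.IGNORECASE)


if __name__ == "__main__":
    sample = "<p>She said <i>no</i>.</p>\nNice line, but the pacing drags.\n"
    print(commentary_to_bbcode(sample))
```

Anything outside the `<p>` tags passes through untouched, which is what keeps the commentary separate from the quoted text.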

Despite having these scripts, I used to, for the longest time, invoke the overall LaTeX stripping script manually when it was needed.  Managing a collection of writing, though, is something ideally suited to a build system, e.g. make.  I hate make with a passion (Makefiles look utterly hideous to me), and a colleague of mine had a fondness for scons.  I've since learned scons is utterly unsuitable for anything of a large scale, but writing projects are pretty small by comparison, so I developed a build system using scons for my writing directory tree.

Scons is really just a build system on top of Python, so its syntax is intuitive if you're already familiar with Python.  Using scons, I automatically process the .tex files to HTML (with a custom stylesheet, for my own website that I haven't spent enough time building really, let alone making useful for people), ff.net compatible HTML, bbcode, plaintext (kinda markdown ish in look), and PDF.  LaTeX is flexible enough that you can use the same source for several PDFs with different looks, also.  I used to use this to print out manuscripts (double-spaced, fixed width) to mark up by hand, but I don't do this anymore.  Still, it has the capability.

Slap on some repository management system of your choice (most people use git, I know, but at work we used darcs for some things, so I started using that; even they don't use darcs anymore, but I like some things about darcs, so it's staying put), and presto!  A useful system for managing writing files (including their past versions) and processing them to useful formats.

For the interested, I've attached a sample of my trees folder, which contains the strip-tex script (a maniacal hodgepodge that's really written more for myself, but it does do the function of processing these tex files to all the other formats; see trees/tools) as well as strip-html, the html to txt/bbcode script.  scons is controlled by an SConstruct script that is the master script in trees/ as well as supplementary scripts SConscriptCommon in trees/ and particular SConscript scripts in trees/src/whatever for each project "whatever".  The latter scripts usually hold only particular variables needed for HTML files (e.g. titles, subtitles, etc. that don't go in the latex files).  Finally, this sample contains the source files for The Coin as a sample to build.

LaTeX itself is very cool for abstracting certain elements out and giving you the global ability to change how you format things.  For instance, I've often used a \thought command to denote a character's thoughts as distinct from narration.  This command is user-defined, so I can use it to format thoughts however I please.  If I decide I don't want any decoration, I can have it do no formatting.  If I want it to be italicized, I just change the definition.  This gives a cool way to group text based on its function.  In theory you could do this with dialogue, too, but I haven't usually done this.  Still, the ability to define simple commands (and redefine them, if necessary) is pretty powerful.  Of course, since I wrote my own script to process things, most of these definitions don't change.
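To make the "change the definition in one place" idea concrete on the conversion side, here is a small Python sketch. It is not taken from strip-tex; the `STYLES` table and the `render_commands` function are hypothetical names, and the pattern only handles one-argument commands without nested braces.

```python
import re

# Semantic command -> (opening markup, closing markup).
# Changing an entry here restyles every use of that command.
STYLES = {"thought": ("<i>", "</i>")}

COMMAND = re.compile(r"\\([A-Za-z]+)\{([^{}]*)\}")


def render_commands(text: str, styles: dict = STYLES) -> str:
    """Replace known one-argument commands with their configured markup."""
    def render(match: re.Match) -> str:
        name, body = match.group(1), match.group(2)
        if name not in styles:
            # Unknown commands are left untouched.
            return match.group(0)
        opening, closing = styles[name]
        return f"{opening}{body}{closing}"

    return COMMAND.sub(render, text)


if __name__ == "__main__":
    line = r"He paused. \thought{Not again.}"
    print(render_commands(line))                          # italicized thoughts
    print(render_commands(line, {"thought": ("", "")}))   # no decoration
```

The source text only says what the passage is (a thought); how it looks is decided by the table, much like redefining the command in the LaTeX preamble.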

To test out the build system, just do "$ scons" in the top level of trees_sample, and all of The Coin associated files (html, ffn compatible html, plain text, bbcode, and pdf) should build.  "$ scons -c" will clean everything out.

sarsaparilla

Quote: your perl style is more "we need just a bunch of regexes, so that's what we're doing."

Or more precisely, "I put in enough effort to learn just one command, so that better suffice for everything I want to do." >_> The substitution approach felt natural since I've been making html out of text files by using the search/replace command of the text editor. As I mentioned, I am not a programmer.

Anyway, thanks for these guidelines and insights. I was vaguely aware of some of the tools you mentioned, some others I had never heard of. I don't know whether I'll ever use actual management software for my stories; there aren't that many files anyway, and I'm afraid of losing my work if I try to use a tool I don't quite understand. I'll take a look at your setup, anyway, so thank you for sharing.

I have also changed the title of the thread to better reflect the contents.


Muphrid

Yeah, as far as managing stuff goes, it may work better (or make more sense) when you have several computers or can host a copy of your work elsewhere.  I'm somewhat frequently away from home, and while I could work on writing over a remote connection, if it's for a prolonged amount of time or if the internet's kinda spotty, I'll just pull a local copy, work on that, record the changes, then push the changes to the server.  And I keep the copy on my desktop separate from the copy being served, so that if somehow I pushed something bad that munges the served copy, I still have the master desktop copy as a backup.

Anyway, I kinda went through with all this management and stuff because it served a good dual-purpose of letting me figure out stuff about work, too, so that was a plus.  I swear I'm not really a programmer; I'm just a scientist imitating one. Honest!

Muphrid

So I actually ended up tinkering with strip-html and put a line in to eliminate most of FFN's cruft.  This version of strip-html is serviceable for converting an FFN-style HTML document to plaintext or bbcode.

Edit: never mind, will have to test that more thoroughly.  Apparently FFN is more prone to ridiculous variation in its html than I could've anticipated.

Dracos

Wow, using scons to manage fanfic.

Anyhow, Brian used a custom script setup I think created by Jon, called Aquilla or whatnot. 

And...I forgot my train of thought over the day or so since I started this post.

I blame The Sun.
Well, Goodbye.

alethiophile

To revive an ancient thread: I've just recently come across pandoc, which is better than any other markup conversion tool I've ever seen, and actually good enough to feed arbitrary input to and be confident in getting something readable. My setup is Markdown files stored under git, plus pandoc; it can output HTML, which should feed to FF.net properly, or (with a third-party script) BBcode for publishing on forums.