Merge pull request #154 from lonekorean/v3

v3
2026-06-05 15:09:59 +09:00 · 2025-03-18 14:54:39 -04:00
parent cd5155cde0 2d33577664
commit 18c8593f7b
30 changed files with 1550 additions and 2112 deletions
@@ -1,17 +0,0 @@
-{
-	"env": {
-		"commonjs": true,
-		"es6": true,
-		"node": true
-	},
-	"extends": "eslint:recommended",
-	"globals": {
-		"Atomics": "readonly",
-		"SharedArrayBuffer": "readonly"
-	},
-	"parserOptions": {
-		"ecmaVersion": 2018
-	},
-	"rules": {
-	}
-}
@@ -0,0 +1,11 @@
+# How to Contribute
+
+Contributions are welcome! Thank you!
+
+## General Guidelines
+
+Some quick notes when making a pull request.
+
+- Match the style and formatting of the code you are editing.
+- Each pull request should be focused on a single thing (a single bug fix, a single feature, etc.). This makes reviewing easier and minimizes merge conflicts.
+- Include a description of the problem being solved and what your code does. Steps to reproduce the problem or example input/output are very helpful.
@@ -1,23 +0,0 @@
-# How to Contribute
-
-Contributions are welcome! Thank you!
-
-## General Guidelines
-
-Some quick notes when making a pull request.
-
-- Match the style and formatting of the code you are editing.
-- Each pull request should be focused on a single thing (a single bug fix, a single feature, etc.). This makes reviewing easier and minimizes merge conflicts.
-- Include a description of the problem being solved and what your code does. Steps to reproduce the problem or example input/output are very helpful.
-
-## Adding Options
-
-Keeping the wizard as short as possible is a priority. Pull requests that add options to the wizard will probably not be accepted. Instead, you can add an advanced setting to [settings.js](https://github.com/lonekorean/wordpress-export-to-markdown/blob/master/src/settings.js).
-
-## Adding Frontmatter Fields
-
-Similarly, default frontmatter output is limited to just a few widely used fields to avoid bloat. However, you may add new optional frontmatter fields.
-
-To do so, follow the instructions in [/src/frontmatter/example.js](https://github.com/lonekorean/wordpress-export-to-markdown/blob/master/src/frontmatter/example.js).
-
-Users will be able to include your new frontmatter field by editing `frontmatter_fields` in [settings.js](https://github.com/lonekorean/wordpress-export-to-markdown/blob/master/src/settings.js).
@@ -1,150 +1,247 @@
 # wordpress-export-to-markdown

-Converts a WordPress export file into Markdown files that are compatible with static site generators ([Eleventy](https://www.11ty.dev/), [Gatsby](https://www.gatsbyjs.com/), [Hugo](https://gohugo.io/), etc.).
+Converts a WordPress export XML file into Markdown files. This makes it easy to migrate from WordPress to a static site generator ([Eleventy](https://www.11ty.dev/), [Gatsby](https://www.gatsbyjs.com/), [Hugo](https://gohugo.io/), etc.).

-Each post is saved as a separate Markdown file with frontmatter. Images are downloaded and saved.
+![wordpress-export-to-markdown running in a terminal](https://github.com/user-attachments/assets/7ac1aa07-b6ee-46f4-ab49-291c1c45f350)

-![wordpress-export-to-markdown running in a terminal](https://user-images.githubusercontent.com/1245573/72686026-3aa04280-3abe-11ea-92c1-d756a24657dd.gif)
+## Features
+
+- Saves each post as a separate Markdown file with frontmatter.
+- Also saves drafts, pages, and custom post types, if you have any.
+- Downloads images and updates references to them.
+- User-friendly wizard guides you through the process.
+- Lots of command line options for configuration, if needed.

 ## Quick Start

 You'll need:
-- [Node.js](https://nodejs.org/) installed
-- Your [WordPress export file](https://wordpress.org/support/article/tools-export-screen/) (be sure to export "All content").

-To make things easier, you can rename your WordPress export file to `export.xml` and drop it into the same directory that you run this script from.
+- [Node.js](https://nodejs.org/) installed.
+- Your [WordPress export file](https://wordpress.org/support/article/tools-export-screen/). Be sure to export "All Content".

-You can run this script immediately in your terminal with `npx`:
+Then run this in your terminal:

 ```
 npx wordpress-export-to-markdown
 ```

-Or you can clone this repo, then from within the repo's directory, install and run:
+## Options

-```
-npm install && node index.js
-```
+The script will start with a wizard to ask you a few questions.

-Either way, the script will start a wizard to configure your options. Answer the questions and off you go!
-
-## Command Line
-
-Options can also be configured via the command line. The wizard will skip asking about any such options. For example, the following will give you [Jekyll](https://jekyllrb.com/)-style output in terms of folder structure and filenames.
-
-Using `npx`:
+Optionally, you can provide answers to any of these questions via command line arguments, in which case the wizard will skip asking those questions. Here's an example:

 ```
 npx wordpress-export-to-markdown --post-folders=false --prefix-date=true
 ```

-Using a locally cloned repo:
-
-```
-node index.js --post-folders=false --prefix-date=true
-```
-
-The wizard will still ask you about any options not specified on the command line. To skip the wizard entirely and use default values for unspecified options, add `--wizard=false`.
-
-## Options
-
-These are the questions asked by the wizard. Command line arguments, along with their default values, are also being provided here if you want to use them.
+The questions are given below, including a snippet for each one showing its command line argument set to its default value.

 ### Path to WordPress export file?

-**Command line:** `--input=export.xml`
+```
+--input=export.xml
+```

-The path to your WordPress export file. To make things easier, you can rename your WordPress export file to `export.xml` and drop it into the same directory that you run this script from.
+The path to your [WordPress export file](https://wordpress.org/documentation/article/tools-export-screen/). To make things easier, you can rename it to `export.xml` and drop it into the same directory that you run the script from.
+
+Allowed values:
+
+- Any path to a file that exists.
+
+### Put each post into its own folder?
+
+```
+--post-folders=true
+```
+
+Whether or not to create a separate folder for each post's Markdown file (and images).
+
+Allowed values:
+
+- `true` - A folder is created for each post, with an `index.md` file and `/images` folder within. The post slug is used to name the folder.
+- `false` - The post slug is used to name each post's Markdown file. These files are all saved in the same folder. All images are saved in a shared `/images` folder.
+
+### Add date prefix to posts?
+
+```
+--prefix-date=false
+```
+
+Whether or not to prepend the post date when naming a post's folder or file.
+
+Allowed values:
+
+- `true` - Prepend the date, in the format `<year>-<month>-<day>`. Nothing will be prepended if there is no date (for example, an undated draft post).
+- `false` - Don't prepend the date.
+
+### Organize posts into date folders?
+
+```
+--date-folders=none
+```
+
+If and how output is organized into folders based on date.
+
+Allowed values:
+
+- `year` - Output is organized into folders by year. This won't happen for posts with no date (for example, an undated draft post).
+- `year-month` - Output is organized into folders by year, then into nested folders by month. Again, for posts with no date, this won't happen.
+- `none` - No date folders are created.
+
+### Save images?
+
+```
+--save-images=all
+```
+
+Which images you want to download and save.
+
+Allowed values:
+
+- `attached` - Save images attached to posts. Generally speaking, these are images that were uploaded by using **Add Media** or **Set Featured Image** in WordPress.
+- `scraped` - Save images scraped from `<img>` tags in post body content. The `<img>` tags are updated to point to where the images are saved.
+- `all` - Save all images, essentially the results of `attached` and `scraped` combined.
+- `none` - Don't save any images.
+
+## Advanced Options
+
+These are not included in the wizard, so you'll need to set them on the command line.
+
+### Use wizard?
+
+```
+--wizard=true
+```
+
+Whether or not to use the wizard.
+
+Allowed values:
+
+- `true` - The script will start with a wizard to ask five questions (the ones from the [Options](#options) section) minus any that were answered on the command line.
+- `false` - Skip wizard. Options set via command line are taken, while the rest have their default values used.

 ### Path to output folder?

-**Command line:** `--output=output`
+```
+--output=output
+```

-The path to the output directory where Markdown and image files will be saved. If it does not exist, it will be created.
+The path to the output folder where files will be saved. It'll be created if it doesn't exist. Existing files there won't be overwritten and won't be downloaded again. This lets you resume progress by restarting the script, if it was previously terminated early. To start clean, delete the output folder.

-### Create year folders?
+Allowed values:

-**Command line:** `--year-folders=false`
+- Any valid folder path.

-Whether or not to organize output files into folders by year.
+### Frontmatter fields?

-### Create month folders?
+```
+--frontmatter-fields=title,date,categories,tags,coverImage,draft
+```

-**Command line:** `--month-folders=false`
+Comma separated list of the frontmatter fields to include in Markdown files. Order is preserved. If a post doesn't have a value for a field, it is left off.

-Whether or not to organize output files into folders by month. You'll probably want to combine this with `--year-folders` to organize files by year then month.
+Allowed values:

-### Create a folder for each post?
+- A comma separated list with any of the following: `author`, `categories`, `coverImage`, `date`, `draft`, `excerpt`, `id`, `slug`, `tags`, `title`, `type`. You can rename a field by appending `:` and the alias to use. For example, `date:created` will rename `date` to `created`.

-**Command line:** `--post-folders=true`
+### Delay between image file requests?

-Whether or not to save files and images into post folders.
+```
+--request-delay=500
+```

-If `true`, the post slug is used for the folder name and the post's Markdown file is named `index.md`. Each post folder will have its own `/images` folder.
+Time (in milliseconds) to wait between requesting image files. Increasing this might help if you see timeouts or server errors.

-    /first-post
-        /images
-            potato.png
-        index.md
-    /second-post
-        /images
-            carrot.jpg
-            celery.jpg
-        index.md
+Allowed values:

-If `false`, the post slug is used to name the post's Markdown file. These files will be side-by-side and images will go into a shared `/images` folder.
+- Any integer greater than or equal to 0.

-    /images
-        carrot.jpg
-        celery.jpg
-        potato.png
-    first-post.md
-    second-post.md
+### Delay between writing markdown files?

-Either way, this can be combined with with `--year-folders` and `--month-folders`, in which case the above output will be organized under the appropriate year and month folders.
+```
+--write-delay=10
+```

-### Prefix post folders/files with date?
+Time (in milliseconds) to wait between saving Markdown files. Increasing this might help if your file system becomes overloaded.

-**Command line:** `--prefix-date=false`
+Allowed values:

-Whether or not to prepend the post date to the post slug when naming a post's folder or file.
+- Any integer greater than or equal to 0.

-If `--post-folders` is `true`, this affects the folder.
+### Timezone to apply to date?

-    /2019-10-14-first-post
-        index.md
-    /2019-10-23-second-post
-        index.md
+```
+--timezone=utc
+```

-If `--post-folders` is `false`, this affects the file.
+The timezone applied to post dates.

-    2019-10-14-first-post.md
-    2019-10-23-second-post.md
+Allowed values:

-### Save images attached to posts?
+- Any valid timezone as [specified here](https://moment.github.io/luxon/#/zones?id=specifying-a-zone).

-**Command line:** `--save-attached-images=true`
+### Include time with frontmatter date?

-Whether or not to download and save images attached to posts. Generally speaking, these are images that were uploaded by using **Add Media** or **Set Featured Image** in WordPress. Images are saved into `/images`.
+```
+--include-time=false
+```

-### Save images scraped from post body content?
+Whether or not time should be included with the date in frontmatter.

-**Command line:** `--save-scraped-images=true`
+Allowed values:

-Whether or not to download and save images scraped from `<img>` tags in post body content. Images are saved into `/images`. The `<img>` tags are updated to point to where the images are saved.
+- `true` - Time is included using an ISO 8601-compliant format. For example, `2020-12-25T11:20:35.000Z`.
+- `false` - Time is not included. For example, `2020-12-25`.

-### Include custom post types and pages?
+### Frontmatter date format string?

-**Command line:** `--include-other-types=false`
+```
+--date-format=""
+```

-Some WordPress sites make use of a `"page"` post type and/or custom post types. Set this to `true` to include these post types in the output. Posts will be organized into post type folders.
+A custom formatting string to apply to frontmatter dates. If set, takes precedence over `--include-time`. An empty string (the default) is ignored, resulting in the basic `<year>-<month>-<day>` format.

-## Customizing Frontmatter and Other Advanced Settings
+Allowed values:

-You can edit [settings.js](https://github.com/lonekorean/wordpress-export-to-markdown/blob/master/src/settings.js) to configure advanced settings beyond the options above. This includes things like customizing frontmatter, date formatting, throttling image downloads, and more.
+- Any valid custom formatting string. See [this table of tokens](https://moment.github.io/luxon/#/parsing?id=table-of-tokens).

-You'll need to run the script locally (not using `npx`) to edit these advanced settings.
+### Wrap frontmatter date in quotes?
+
+```
+--quote-date=false
+```
+
+Whether or not to put double quotes around the date when writing it to frontmatter.
+
+Allowed values:
+
+- `true` - Adds double quotes. This technically turns the date into a string value.
+- `false` - Doesn't add double quotes.
+
+### Use strict SSL?
+
+```
+--strict-ssl=true
+```
+
+Whether or not to use strict SSL when downloading images.
+
+Allowed values:
+
+- `true` - Use strict SSL. This is the safer option.
+- `false` - Don't use strict SSL. This will let you avoid the "self-signed certificate" error when working with a self-signed server. Just make sure you know what you're doing.
+
+## Local Development
+
+You can install and run this script locally if you want to tinker with it:
+
+1. `git clone` this repo.
+2. `cd` into the repo directory.
+3. Run `npm install`.
+
+Now instead of running `npx wordpress-export-to-markdown` you can run `node app`. They both take all the same command line arguments in the same way.

 ## Contributing

-Please read the [contribution guidelines](https://github.com/lonekorean/wordpress-export-to-markdown/blob/master/CONTRIBUTING.md).
+Please read the [contribution guidelines](https://github.com/lonekorean/wordpress-export-to-markdown/blob/master/.github/CONTRIBUTING.md).
@@ -0,0 +1,38 @@
+#!/usr/bin/env node
+
+import chalk from 'chalk';
+import * as commander from 'commander';
+import path from 'path';
+import * as intake from './src/intake.js';
+import * as parser from './src/parser.js';
+import * as shared from './src/shared.js';
+import * as writer from './src/writer.js';
+
+(async () => {
+	// configure command line help output
+	commander.program
+		.name('npx wordpress-export-to-markdown')
+		.helpOption('-h, --help', 'See the thing you\'re looking at right now')
+		.addHelpText('after', '\nMore documentation is at https://github.com/lonekorean/wordpress-export-to-markdown')
+		.configureHelp({
+			styleOptionTerm: (str) => str.replace(/(<.*>)$/, chalk.gray('$1')),
+			styleOptionDescription: (str) => str.replace(/(\(.*\))$/, chalk.gray('$1'))
+		});
+		
+	// gather config options from command line and wizard
+	await intake.getConfig();
+
+	// parse data from XML and do Markdown translations
+	const posts = await parser.parseFilePromise();
+
+	// write files and download images
+	await writer.writeFilesPromise(posts);
+
+	// happy goodbye
+	console.log('\nAll done!');
+	console.log('Look for your output files in: ' + path.resolve(shared.config.output));
+})().catch((ex) => {
+	// sad goodbye
+	console.log('\nSomething went wrong, execution halted early.');
+	console.error(ex);
+});
@@ -1,27 +0,0 @@
-#!/usr/bin/env node
-
-const path = require('path');
-const process = require('process');
-
-const wizard = require('./src/wizard');
-const parser = require('./src/parser');
-const writer = require('./src/writer');
-
-(async () => {
-	// parse any command line arguments and run wizard
-	const config = await wizard.getConfig(process.argv);
-
-	// parse data from XML and do Markdown translations
-	const posts = await parser.parseFilePromise(config)
-
-	// write files, downloading images as needed
-	await writer.writeFilesPromise(posts, config);
-
-	// happy goodbye
-	console.log('\nAll done!');
-	console.log('Look for your output files in: ' + path.resolve(config.output));
-})().catch(ex => {
-	// sad goodbye
-	console.log('\nSomething went wrong, execution halted early.');
-	console.error(ex);
-});
@@ -2,7 +2,7 @@
 	"name": "wordpress-export-to-markdown",
 	"version": "2.4.2",
 	"description": "Converts a WordPress export XML file into Markdown files.",
-	"main": "index.js",
+	"main": "app.js",
 	"repository": "https://github.com/lonekorean/wordpress-export-to-markdown.git",
 	"keywords": [
 		"blog",
@@ -11,30 +11,23 @@
 		"markdown",
 		"wordpress"
 	],
-	"scripts": {
-		"test": "echo \"Error: no test specified\" && exit 1"
-	},
 	"author": "Will Boyd <will@codersblock.com> (https://codersblock.com)",
 	"license": "MIT",
 	"engines": {
-		"node": ">= 18.0.0"
+		"node": ">= 20.5.0"
 	},
+	"type": "module",
 	"dependencies": {
-		"axios": "^1.7.9",
-		"camelcase": "^6.3.0",
-		"chalk": "^4.1.2",
-		"commander": "^13.0.0",
-		"inquirer": "^8.2.6",
+		"@guyplusplus/turndown-plugin-gfm": "^1.0.7",
+		"@inquirer/prompts": "^7.4.0",
+		"axios": "^1.8.2",
+		"chalk": "^5.4.1",
+		"commander": "^13.1.0",
 		"luxon": "^3.5.0",
-		"require-directory": "^2.1.1",
 		"turndown": "^7.2.0",
-		"turndown-plugin-gfm": "^1.0.2",
 		"xml2js": "^0.6.2"
 	},
-	"devDependencies": {
-		"eslint": "^8.57.1"
-	},
 	"bin": {
-		"wordpress-export-to-markdown": "./index.js"
+		"wordpress-export-to-markdown": "./app.js"
 	}
 }
@@ -0,0 +1,108 @@
+import xml2js from 'xml2js';
+
+class Data {
+	#obj;
+	#expression;
+
+	constructor(obj, expression) {
+		// xml2js returns leaf nodes as strings, turn those into consistent objects
+		// I found this to be safer and more efficient than using the explicitCharkey option
+		this.#obj = typeof obj === 'string' ? { _: obj } : obj;
+
+		// this identifies how the object was referenced, helps a ton with debugging
+		this.#expression = expression;
+	}
+
+	#buildExpression(propName, index = undefined) {
+		let expression = `${this.#expression}.${propName}`;
+		if (index !== undefined) {
+			expression += `[${index}]`;
+		}
+
+		return expression;
+	}
+
+	// used by "optional" functions to return undefined instead of throwing an error
+	#optional(func) {
+		try {
+			return func();
+		} catch (ex) {
+			return undefined;
+		}
+	}
+
+	// will not throw an error if property doesn't exist, defaults to empty array
+	children(propName) {
+		const nodes = this.#obj[propName] ?? [];
+		return nodes.map((value, index) => new Data(value, this.#buildExpression(propName, index)));
+	}
+
+	// throws an error if property (or index on property) doesn't exist
+	child(propName, index = 0) {
+		const nodes = this.#obj[propName];
+		if (nodes === undefined) {
+			throw new Error(`Could not find ${this.#buildExpression(propName)}.`);
+		}
+
+		const node = nodes[index];
+		if (node === undefined) {
+			throw new Error(`Could not find ${this.#buildExpression(propName, index)}.`);
+		}
+
+		return new Data(node, this.#buildExpression(propName, index));
+	}
+
+	// convenience function, since it's very common to want the value of a child
+	childValue(propName, index = 0) {
+		return this.child(propName, index).value();
+	}
+	
+	// throws an error if this object doesn't have a value string
+	value() {
+		const value = this.#obj._;
+		if (value === undefined) {
+			throw new Error(`Could not get value from ${this.#expression}.`);
+		}
+
+		return value;
+	}
+
+	// throws an error if attribute does not exist
+	attribute(attrName) {
+		const attribute = this.#obj.$?.[attrName];
+		if (attribute === undefined) {
+			throw new Error(`Could not get attribute ${attrName} from ${this.#expression}.`);
+		}
+
+		return attribute;
+	}
+
+	optionalChild(propName, index = 0) {
+		return this.#optional(() => this.child(propName, index));
+	}
+
+	optionalChildValue(propName, index = 0) {
+		return this.#optional(() => this.childValue(propName, index));
+	}
+
+	optionalValue() {
+		return this.#optional(() => this.value());
+	}
+}
+
+export async function load(content) {
+	const rootData = await xml2js.parseStringPromise(content, {
+		tagNameProcessors: [xml2js.processors.stripPrefix],
+		trim: true
+	}).catch((ex) => {
+		ex.message = 'Could not parse XML. This likely means your import file is malformed.\n\n' + ex.message;
+		throw ex;
+	});
+
+	const rssData = rootData.rss;
+	if (rssData === undefined) {
+		throw new Error('Could not find <rss> root node. This likely means your import file is malformed.');
+	}
+
+	return new Data(rssData, 'rss');
+}
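Review note: the `Data` wrapper above normalizes xml2js leaf strings into `{ _: value }` objects and tracks an expression string for error messages. The sketch below is a simplified, hypothetical stand-in for `childValue()` (it is not code from this commit) that shows both ideas on a plain object. The shape of `item` is an assumption about xml2js output with default options, where element text lives under `_` and attributes under `$`. Unlike the real class, this version always includes the index in its error message.

```javascript
// Simplified, hypothetical stand-in for Data.childValue(), for illustration only.
function childValue(obj, expression, propName, index = 0) {
	const path = `${expression}.${propName}[${index}]`;
	const node = obj[propName]?.[index];
	if (node === undefined) {
		throw new Error(`Could not find ${path}.`);
	}

	// xml2js returns leaf nodes as bare strings, so wrap them like the constructor does
	const wrapped = typeof node === 'string' ? { _: node } : node;
	if (wrapped._ === undefined) {
		throw new Error(`Could not get value from ${path}.`);
	}

	return wrapped._;
}

// Assumed shape of one parsed <item>, with tag prefixes already stripped.
const item = {
	title: ['Hello World'],
	category: [{ _: 'News', $: { domain: 'category', nicename: 'news' } }]
};

const title = childValue(item, 'rss.channel[0].item[0]', 'title');
const category = childValue(item, 'rss.channel[0].item[0]', 'category');

// A missing node produces an error that names exactly where the lookup failed.
let message;
try {
	childValue(item, 'rss.channel[0].item[0]', 'creator');
} catch (ex) {
	message = ex.message;
}

console.log(title, category, message);
```

The expression string is what makes a failure on a malformed export traceable back to a specific node, which is the debugging benefit the constructor comment mentions.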
@@ -0,0 +1,63 @@
+export function author(post) {
+	// not decoded (WordPress doesn't allow funky characters in usernames anyway)
+	// surprisingly, does not always exist (squarespace exports, for example)
+	return post.data.optionalChildValue('creator');
+}
+
+export function categories(post) {
+	// array of decoded category names, excluding 'uncategorized'
+	const categories = post.data.children('category');
+	return categories
+		.filter((category) => category.attribute('domain') === 'category' && category.attribute('nicename') !== 'uncategorized')
+		.map((category) => decodeURIComponent(category.attribute('nicename')));
+}
+
+export function coverImage(post) {
+	// cover image filename, previously parsed and decoded
+	return post.coverImage;
+}
+
+export function date(post) {
+	// a luxon datetime object, previously parsed
+	return post.date;
+}
+
+export function draft(post) {
+	// boolean representing the previously parsed draft status, only included when true
+	return post.isDraft ? true : undefined;
+}
+
+export function excerpt(post) {
+	// not decoded, newlines collapsed
+	// does not always exist (squarespace exports, for example)
+	const encoded = post.data.optionalChildValue('encoded', 1);
+	return encoded ? encoded.replace(/[\r\n]+/gm, ' ') : undefined;
+}
+
+export function id(post) {
+	// previously parsed as a string, converted to integer here
+	return parseInt(post.id);
+}
+
+export function slug(post) {
+	// previously parsed and decoded
+	return post.slug;
+}
+
+export function tags(post) {
+	// array of decoded tag names (yes, they come from <category> nodes, not a typo)
+	const categories = post.data.children('category');
+	return categories
+		.filter((category) => category.attribute('domain') === 'post_tag')
+		.map((category) => decodeURIComponent(category.attribute('nicename')));
+}
+
+export function title(post) {
+	// not decoded
+	return post.data.childValue('title');
+}
+
+export function type(post) {
+	// previously parsed but not decoded, can be "post", "page", or other custom types
+	return post.type;
+}
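Review note: the README in this commit says `--frontmatter-fields` preserves order, leaves off fields without a value, and supports renaming with `name:alias`. The code that consumes these getter functions is not shown in this part of the diff, so the following is only a minimal sketch of how that could work. `parseFrontmatterFields`, `buildFrontmatter`, and the `getters` object are hypothetical names used for illustration.

```javascript
// Hypothetical helper, not part of this diff: splits "name:alias" entries.
function parseFrontmatterFields(fields) {
	return fields.map((field) => {
		const [name, alias] = field.split(':');
		return { name, alias: alias ?? name };
	});
}

// Builds a frontmatter object, preserving field order and skipping undefined values.
function buildFrontmatter(post, fields, getters) {
	const frontmatter = {};
	for (const { name, alias } of parseFrontmatterFields(fields)) {
		const getter = getters[name];
		if (!getter) {
			throw new Error(`Unknown frontmatter field: ${name}`);
		}

		const value = getter(post);
		if (value !== undefined) {
			frontmatter[alias] = value;
		}
	}

	return frontmatter;
}

// Stand-in getters that mirror the style of the exported functions above.
const getters = {
	title: (post) => post.title,
	draft: (post) => (post.isDraft ? true : undefined),
	slug: (post) => post.slug
};

const result = buildFrontmatter(
	{ title: 'Hello', isDraft: false, slug: 'hello' },
	['title', 'draft', 'slug:permalink'],
	getters
);

console.log(result); // { title: 'Hello', permalink: 'hello' }
```

Returning `undefined` from a getter (as `draft()` does for published posts) is what lets a field drop out of the output without any special casing.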
@@ -1,5 +0,0 @@
-// get author, without decoding
-// WordPress doesn't allow funky characters in usernames anyway
-module.exports = (post) => {
-	return post.data.creator[0];
-}
@@ -1,14 +0,0 @@
-const settings = require('../settings');
-
-// get array of decoded category names, filtered as specified in settings
-module.exports = (post) => {
-	if (!post.data.category) {
-		return [];
-	}
-
-	const categories = post.data.category
-		.filter(category => category.$.domain === 'category')
-		.map(({ $: attributes }) => decodeURIComponent(attributes.nicename));
-
-	return categories.filter(category => !settings.filter_categories.includes(category));
-};
@@ -1,5 +0,0 @@
-// get cover image filename, previously decoded and set on post.meta
-// this one is unique as it relies on special logic executed by the parser
-module.exports = (post) => {
-	return post.meta.coverImage;
-};
@@ -1,17 +0,0 @@
-const luxon = require('luxon');
-
-const settings = require('../settings');
-
-// get post date, optionally formatted as specified in settings
-// this value is also used for year/month folders, date prefixes, etc. as needed
-module.exports = (post) => {
-	const dateTime = luxon.DateTime.fromRFC2822(post.data.pubDate[0], { zone: settings.custom_date_timezone });
-
-	if (settings.custom_date_formatting) {
-		return dateTime.toFormat(settings.custom_date_formatting);
-	} else if (settings.include_time_with_date) {
-		return dateTime.toISO();
-	} else {
-		return dateTime.toISODate();
-	}
-};
@@ -1,19 +0,0 @@
-/*
-	1. Copy this file, rename to the frontmatter field name you want, camelcased
-	2. Edit frontmatter_fields in settings.js to include your new field name
-	3. Run the script to see post data dumps, to see what you can work with
-	4. Write your code to get and return what you want
-	5. Update "get whatever" comment to describe what you're getting
-	6. Remove your field name from frontmatter_fields in settings.js
-	7. Remove this comment block and the debug console code
-	8. Make that pull request!
-*/
-
-// get whatever
-module.exports = (post) => {
-	console.log('\nBEGIN POST DATA DUMP ===========================================================\n');
-	console.dir(post, { depth: null });
-	console.log('\nEND POST DATA DUMP =============================================================\n');
-
-	return 'EXAMPLE: ' + post.data.title[0];
-};
@@ -1,4 +0,0 @@
-// get excerpt, not decoded, newlines collapsed
-module.exports = (post) => {
-	return post.data.encoded[1].replace(/[\r\n]+/gm, ' ');
-};
@@ -1,4 +0,0 @@
-// get ID
-module.exports = (post) => {
-	return post.data.post_id[0];
-}
@@ -1,4 +0,0 @@
-// get slug, previously decoded and set on post.meta
-module.exports = (post) => {
-	return post.meta.slug;
-};
@@ -1,12 +0,0 @@
-// get array of decoded tag names
-module.exports = (post) => {
-	if (!post.data.category) {
-		return [];
-	}
-
-	const categories = post.data.category
-		.filter(category => category.$.domain === 'post_tag')
-		.map(({ $: attributes }) => decodeURIComponent(attributes.nicename));
-
-	return categories;
-};
@@ -1,4 +0,0 @@
-// get simple post title, but not decoded like other frontmatter string fields
-module.exports = (post) => {
-	return post.data.title[0];
-};
@@ -1,5 +0,0 @@
-// get type, often this will always be "post"
-// but can also be "page" or other custom types
-module.exports = (post) => {
-	return post.data.post_type[0];
-}
@@ -0,0 +1,165 @@
+import chalk from 'chalk';
+import * as commander from 'commander';
+import * as luxon from 'luxon';
+import path from 'path';
+import * as normalizers from './normalizers.js';
+import * as questions from './questions.js';
+import * as shared from './shared.js';
+
+// visual formatting for wizard
+const promptTheme = {
+	prefix: {
+		idle: chalk.gray('\n?'),
+		done: chalk.green('✓')
+	},
+	style: {
+		description: (text) => chalk.gray('example: ' + text)
+	}
+};
+
+export async function getConfig() {
+	// check command line for any config options
+	const commandLineQuestions = questions.load();
+	const commandLineAnswers = getCommandLineAnswers(commandLineQuestions);
+
+	let wizardAnswers;
+	if (commandLineAnswers.wizard) {
+		shared.logHeading('Starting wizard');
+
+		// run wizard for questions with prompts that were not answered via the command line
+		const wizardQuestions = questions.load().filter((question) => {
+			return question.prompt && !(shared.camelCase(question.name) in commandLineAnswers);
+		});
+		wizardAnswers = await getWizardAnswers(wizardQuestions, commandLineAnswers);
+	} else {
+		shared.logHeading('Skipping wizard');
+	}
+
+	Object.assign(shared.config, commandLineAnswers, wizardAnswers);
+}
+
+function getCommandLineAnswers(questions) {
+	// show errors in red
+	commander.program.configureOutput({
+		outputError: (str, write) => write(chalk.red(str))
+	});
+	
+	questions.forEach((question) => {
+		const option = new commander.Option('--' + question.name + ' <' + question.type + '>', question.description);
+		option.default(question.default);
+
+		if (!question.description) {
+			option.hideHelp();
+		}
+
+		if (question.choices && question.type !== 'boolean') {
+			// let commander handle non-boolean multiple choice validation
+			option.choices(question.choices.map((choice) => choice.value));
+		} else {
+			option.argParser((value) => normalize(value, question.type, (errorMessage) => {
+				throw new commander.InvalidArgumentError(errorMessage);
+			}));
+		}
+
+		commander.program.addOption(option);
+	});
+
+	const answers = commander.program.parse().opts();
+
+	// do some post-processing on the answers
+	for (const [key, value] of Object.entries(answers)) {
+		// the "wizard" answer and any user-provided (not defaulted) answers are left alone
+		if (key === 'wizard' || commander.program.getOptionValueSource(key) !== 'default') {
+			continue;
+		}
+
+		const question = questions.find((question) => shared.camelCase(question.name) === key);
+		if (answers.wizard && question.prompt) {
+			// remove this default answer, allowing the wizard to ask about it later
+			delete answers[key];
+		} else {
+			// normalize and validate default answer
+			answers[key] = normalize(value, question.type, (errorMessage) => {
+				// this is formatted to match how commander displays other errors
+				commander.program.error(`error: option '--${question.name} <${question.type}>' argument '${value}' is invalid. ${errorMessage}`);
+			});
+		}
+	}
+
+	return answers;
+}
+
+export async function getWizardAnswers(questions, commandLineAnswers) {
+	const answers = {};
+	for (const question of questions) {
+		let answerKey = shared.camelCase(question.name);
+		let normalizedAnswer; // holds normalized answer value potentially returned during validation
+
+		const promptConfig = {
+			theme: promptTheme,
+			message: question.description + '?',
+			default: question.default,
+		};
+
+		if (question.choices) {
+			promptConfig.choices = question.choices;
+			promptConfig.loop = false;
+
+			if (question.isPathQuestion) {
+				promptConfig.choices.forEach((choice) => {
+					// show example path if this choice is selected
+					choice.description = buildSamplePostPath({
+						...commandLineAnswers,		// with command line answers
+						...answers,					// and wizard answers so far
+						output: path.sep,			// and a simplified output folder
+						[answerKey]: choice.value	// and this choice selected
+					});
+				});
+			}
+		} else {
+			promptConfig.validate = (value) => {
+				let validationErrorMessage;
+				normalizedAnswer = normalize(value, question.type, (errorMessage) => {
+					validationErrorMessage = errorMessage;
+				});
+				return validationErrorMessage ?? true;
+			}
+		}
+
+		const answer = await question.prompt(promptConfig).catch((ex) => {
+			// exit gracefully if user hits ctrl + c during wizard
+			if (ex instanceof Error && ex.name === 'ExitPromptError') {
+				console.log('\nUser quit wizard early.');
+				process.exit(0);
+			} else {
+				throw ex;
+			}
+		});
+
+		answers[answerKey] = normalizedAnswer ?? answer;
+	}
+
+	return answers;
+}
+
+function normalize(value, type, onError) {
+	const normalizer = normalizers[shared.camelCase(type)];
+	if (!normalizer) {
+		return value;
+	}
+
+	try {
+		return normalizer(value);
+	} catch (ex) {
+		onError(ex.message);
+	}
+}
+
+export function buildSamplePostPath(overrideConfig) {
+	const samplePost = {
+		date: luxon.DateTime.now(),
+		slug: 'my-post'
+	};
+
+	return shared.buildPostPath(samplePost, overrideConfig);
+}
@@ -0,0 +1,49 @@
+import fs from 'fs';
+import path from 'path';
+
+export function boolean(value) {
+	if (typeof value === 'boolean') {
+		return value;
+	} else if (value === 'true') {
+		return true;
+	} else if (value === 'false') {
+		return false;
+	}
+
+	throw new Error('Must be true or false.');
+}
+
+export function filePath(value) {
+	const unwrapped = value.replace(/"(.*?)"/, '$1');
+	const absolute = path.resolve(unwrapped);
+
+	let fileExists;
+	try {
+		fileExists = fs.existsSync(absolute) && fs.statSync(absolute).isFile();
+	} catch (ex) {
+		fileExists = false;
+	}
+
+	if (fileExists) {
+		return absolute;
+	}
+
+	throw new Error('File not found at ' + absolute + '.');
+}
+
+export function list(value) {
+	if (Array.isArray(value)) {
+		return value;
+	} else {
+		return value.trim().split(/\s*,\s*/);
+	}
+}
+
+export function integer(value) {
+	const int = parseInt(value, 10);
+	if (!Number.isNaN(int) && int >= 0) {
+		return int;
+	}
+
+	throw new Error('Must be an integer >= 0.');
+}
@@ -1,32 +1,26 @@
-const fs = require('fs');
-const requireDirectory = require('require-directory');
-const xml2js = require('xml2js');
+import chalk from 'chalk';
+import fs from 'fs';
+import * as luxon from 'luxon';
+import * as data from './data.js';
+import * as frontmatter from './frontmatter.js';
+import * as shared from './shared.js';
+import * as translator from './translator.js';

-const shared = require('./shared');
-const settings = require('./settings');
-const translator = require('./translator');
+export async function parseFilePromise() {
+	shared.logHeading('Parsing');
+	const content = await fs.promises.readFile(shared.config.input, 'utf8');
+	const rssData = await data.load(content);
+	const allPostData = rssData.child('channel').children('item');

-// dynamically requires all frontmatter getters
-const frontmatterGetters = requireDirectory(module, './frontmatter', { recurse: false });
-
-async function parseFilePromise(config) {
-	console.log('\nParsing...');
-	const content = await fs.promises.readFile(config.input, 'utf8');
-	const allData = await xml2js.parseStringPromise(content, {
-		trim: true,
-		tagNameProcessors: [xml2js.processors.stripPrefix]
-	});
-	const channelData = allData.rss.channel[0].item;
-
-	const postTypes = getPostTypes(channelData, config);
-	const posts = collectPosts(channelData, postTypes, config);
+	const postTypes = getPostTypes(allPostData);
+	const posts = collectPosts(allPostData, postTypes);

 	const images = [];
-	if (config.saveAttachedImages) {
-		images.push(...collectAttachedImages(channelData));
+	if (shared.config.saveImages === 'attached' || shared.config.saveImages === 'all') {
+		images.push(...collectAttachedImages(allPostData));
 	}
-	if (config.saveScrapedImages) {
-		images.push(...collectScrapedImages(channelData, postTypes));
+	if (shared.config.saveImages === 'scraped' || shared.config.saveImages === 'all') {
+		images.push(...collectScrapedImages(allPostData, postTypes));
 	}

 	mergeImagesIntoPosts(images, posts);
@@ -35,110 +29,135 @@ async function parseFilePromise(config) {
 	return posts;
 }

-function getPostTypes(channelData, config) {
-	if (config.includeOtherTypes) {
-		// search export file for all post types minus some default types we don't want
-		// effectively this will be 'post', 'page', and custom post types
-		const types = channelData
-			.map(item => item.post_type[0])
-			.filter(type => !['attachment', 'revision', 'nav_menu_item', 'custom_css', 'customize_changeset'].includes(type));
-		return [...new Set(types)]; // remove duplicates
-	} else {
-		// just plain old vanilla "post" posts
-		return ['post'];
-	}
+function getPostTypes(allPostData) {
+	// search export file for all post types minus some specific types we don't want
+	const postTypes = [...new Set(allPostData // new Set() is used to dedupe array
+		.map((postData) => postData.childValue('post_type'))
+		.filter((postType) => ![
+			'attachment',
+			'revision',
+			'nav_menu_item',
+			'custom_css',
+			'customize_changeset',
+			'oembed_cache',
+			'user_request',
+			'wp_block',
+			'wp_global_styles',
+			'wp_navigation',
+			'wp_template',
+			'wp_template_part'
+		].includes(postType))
+	)];
+
+	// change order to "post", "page", then all custom post types (in the order they were found)
+	prioritizePostType(postTypes, 'page');
+	prioritizePostType(postTypes, 'post');
+
+	return postTypes;
 }

-function getItemsOfType(channelData, type) {
-	return channelData.filter(item => item.post_type[0] === type);
+function getItemsOfType(allPostData, type) {
+	return allPostData.filter((item) => item.childValue('post_type') === type);
 }

-function collectPosts(channelData, postTypes, config) {
-	// this is passed into getPostContent() for the markdown conversion
-	const turndownService = translator.initTurndownService();
-
+function collectPosts(allPostData, postTypes) {
 	let allPosts = [];
-	postTypes.forEach(postType => {
-		const postsForType = getItemsOfType(channelData, postType)
-			.filter(postData => postData.status[0] !== 'trash' && postData.status[0] !== 'draft')
-			.map(postData => ({
-				// raw post data, used by frontmatter getters
-				data: postData,
+	postTypes.forEach((postType) => {
+		const postsForType = getItemsOfType(allPostData, postType)
+			.filter((postData) => postData.childValue('status') !== 'trash')
+			.filter((postData) => !(postType === 'page' && postData.childValue('post_name') === 'sample-page'))
+			.map((postData) => buildPost(postData));

-				// meta data isn't written to file, but is used to help with other things
-				meta: {
-					id: getPostId(postData),
-					slug: getPostSlug(postData),
-					coverImageId: getPostCoverImageId(postData),
-					coverImage: undefined, // possibly set later in mergeImagesIntoPosts()
-					type: postType,
-					imageUrls: [] // possibly set later in mergeImagesIntoPosts()
-				},
-
-				// contents of the post in markdown
-				content: translator.getPostContent(postData, turndownService, config)
-			}));
-
-		if (postTypes.length > 1) {
-			console.log(`${postsForType.length} "${postType}" posts found.`);
+		if (postsForType.length > 0) {
+			if (postType === 'post') {
+				console.log(`${postsForType.length} normal posts found.`);
+			} else if (postType === 'page') {
+				console.log(`${postsForType.length} pages found.`);
+			} else {
+				console.log(`${postsForType.length} custom "${postType}" posts found.`);
+			}
 		}

 		allPosts.push(...postsForType);
 	});

-	if (postTypes.length === 1) {
-		console.log(allPosts.length + ' posts found.');
-	}
 	return allPosts;
 }

-function getPostId(postData) {
-	return postData.post_id[0];
+function buildPost(data) {
+	return {
+		// full raw post data
+		data,
+
+		// body content converted to markdown
+		content: translator.getPostContent(data.childValue('encoded')),
+
+		// particularly useful values for all sorts of things
+		type: data.childValue('post_type'),
+		id: data.childValue('post_id'),
+		isDraft: data.childValue('status') === 'draft',
+		slug: decodeURIComponent(data.childValue('post_name')),
+		date: getPostDate(data),
+		coverImageId: getPostMetaValue(data, '_thumbnail_id'),
+
+		// these are possibly set later in mergeImagesIntoPosts()
+		coverImage: undefined,
+		imageUrls: []
+	};
 }

-function getPostSlug(postData) {
-	return decodeURIComponent(postData.post_name[0]);
+function getPostDate(data) {
+	const date = luxon.DateTime.fromRFC2822(data.childValue('pubDate'), { zone: shared.config.timezone });
+	return date.isValid ? date : undefined;
 }

-function getPostCoverImageId(postData) {
-	if (postData.postmeta === undefined) {
-		return undefined;
+function getPostMetaValue(data, key) {
+	const metas = data.children('postmeta');
+	const meta = metas.find((meta) => meta.childValue('meta_key') === key);
+	return meta ? meta.childValue('meta_value') : undefined;
 }

-	const postmeta = postData.postmeta.find(postmeta => postmeta.meta_key[0] === '_thumbnail_id');
-	const id = postmeta ? postmeta.meta_value[0] : undefined;
-	return id;
-}
-
-function collectAttachedImages(channelData) {
-	const images = getItemsOfType(channelData, 'attachment')
+function collectAttachedImages(allPostData) {
+	const images = getItemsOfType(allPostData, 'attachment')
 		// filter to certain image file types
-		.filter(attachment => attachment.attachment_url && (/\.(gif|jpe?g|png|webp)$/i).test(attachment.attachment_url[0]))
-		.map(attachment => ({
-			id: attachment.post_id[0],
-			postId: attachment.post_parent[0],
-			url: attachment.attachment_url[0]
+		.filter((attachment) => {
+			const url = attachment.childValue('attachment_url');
+			return url && (/\.(gif|jpe?g|png|webp)$/i).test(url);
+		})
+		.map((attachment) => ({
+			id: attachment.childValue('post_id'),
+			postId: attachment.optionalChildValue('post_parent') ?? 'nope', // may not exist (cover image in a Squarespace export, for example)
+			url: attachment.childValue('attachment_url')
 		}));

 	console.log(images.length + ' attached images found.');
 	return images;
 }

-function collectScrapedImages(channelData, postTypes) {
+function collectScrapedImages(allPostData, postTypes) {
 	const images = [];
-	postTypes.forEach(postType => {
-		getItemsOfType(channelData, postType).forEach(postData => {
-			const postId = postData.post_id[0];
-			const postContent = postData.encoded[0];
-			const postLink = postData.link[0];
+	postTypes.forEach((postType) => {
+		getItemsOfType(allPostData, postType).forEach((postData) => {
+			const postId = postData.childValue('post_id');
+
+			const postContent = postData.childValue('encoded');
+			const scrapedUrls = [...postContent.matchAll(/<img(?=\s)[^>]+?(?<=\s)src="(.+?)"[^>]*>/gi)].map((match) => match[1]);
+			scrapedUrls.forEach((scrapedUrl) => {
+				let url;
+				if (isAbsoluteUrl(scrapedUrl)) {
+					url = scrapedUrl;
+				} else {
+					const postLink = postData.childValue('link');
+					if (isAbsoluteUrl(postLink)) {
+						url = new URL(scrapedUrl, postLink).href;
+					} else {
+						throw new Error(`Unable to determine absolute URL from scraped image URL '${scrapedUrl}' and post link URL '${postLink}'.`);
+					}
+				}

-			const matches = [...postContent.matchAll(/<img[^>]*src="(.+?\.(?:gif|jpe?g|png|webp))"[^>]*>/gi)];
-			matches.forEach(match => {
-				// base the matched image URL relative to the post URL
-				const url = new URL(match[1], postLink).href;
 				images.push({
-					id: -1,
-					postId: postId,
+					id: 'nope', // scraped images don't have an id
+					postId,
 					url
 				});
 			});
@@ -150,43 +169,52 @@ function collectScrapedImages(channelData, postTypes) {
 }

 function mergeImagesIntoPosts(images, posts) {
-	images.forEach(image => {
-		posts.forEach(post => {
+	images.forEach((image) => {
+		posts.forEach((post) => {
 			let shouldAttach = false;

 			// this image was uploaded as an attachment to this post
-			if (image.postId === post.meta.id) {
+			if (image.postId === post.id) {
 				shouldAttach = true;
 			}

 			// this image was set as the featured image for this post
-			if (image.id === post.meta.coverImageId) {
+			if (image.id === post.coverImageId) {
 				shouldAttach = true;
-				post.meta.coverImage = shared.getFilenameFromUrl(image.url);
+				post.coverImage = shared.getFilenameFromUrl(image.url);
 			}

-			if (shouldAttach && !post.meta.imageUrls.includes(image.url)) {
-				post.meta.imageUrls.push(image.url);
+			if (shouldAttach && !post.imageUrls.includes(image.url)) {
+				post.imageUrls.push(image.url);
 			}
 		});
 	});
 }

 function populateFrontmatter(posts) {
-	posts.forEach(post => {
-		const frontmatter = {};
-		settings.frontmatter_fields.forEach(field => {
+	posts.forEach((post) => {
+		post.frontmatter = {};
+		shared.config.frontmatterFields.forEach((field) => {
 			const [key, alias] = field.split(':');

-			let frontmatterGetter = frontmatterGetters[key];
+			let frontmatterGetter = frontmatter[key];
 			if (!frontmatterGetter) {
 				throw `Could not find a frontmatter getter named "${key}".`;
 			}

-			frontmatter[alias || key] = frontmatterGetter(post);
+			post.frontmatter[alias ?? key] = frontmatterGetter(post);
 		});
-		post.frontmatter = frontmatter;
 	});
 }

-exports.parseFilePromise = parseFilePromise;
+function prioritizePostType(postTypes, postType) {
+	const index = postTypes.indexOf(postType);
+	if (index !== -1) {
+		postTypes.splice(index, 1);
+		postTypes.unshift(postType);
+	}
+}
+
+function isAbsoluteUrl(url) {
+	return (/^https?:\/\//i).test(url);
+}
@@ -0,0 +1,158 @@
+import * as inquirer from '@inquirer/prompts';
+
+export function load() {
+	// questions with a description are displayed in command line help
+	// questions with a prompt are included in the wizard (if not set on the command line)
+	return [
+		{
+			name: 'input',
+			type: 'file-path',
+			description: 'Path to WordPress export file',
+			default: 'export.xml',
+			prompt: inquirer.input
+		},
+		{
+			name: 'post-folders',
+			type: 'boolean',
+			description: 'Put each post into its own folder',
+			default: true,
+			choices: [
+				{
+					name: 'Yes',
+					value: true
+				},
+				{
+					name: 'No',
+					value: false
+				}
+			],
+			isPathQuestion: true,
+			prompt: inquirer.select
+		},
+		{
+			name: 'prefix-date',
+			type: 'boolean',
+			description: 'Add date prefix to posts',
+			default: false,
+			choices: [
+				{
+					name: 'Yes',
+					value: true
+				},
+				{
+					name: 'No',
+					value: false
+				}
+			],
+			isPathQuestion: true,
+			prompt: inquirer.select
+		},
+		{
+			name: 'date-folders',
+			type: 'choice',
+			description: 'Organize posts into date folders',
+			default: 'none',
+			choices: [
+				{
+					name: 'Year folders',
+					value: 'year'
+				},
+				{
+					name: 'Year and month folders',
+					value: 'year-month'
+				},
+				{
+					name: 'No',
+					value: 'none'
+				}
+			],
+			isPathQuestion: true,
+			prompt: inquirer.select
+		},
+		{
+			name: 'save-images',
+			type: 'choice',
+			description: 'Save images',
+			default: 'all',
+			choices: [
+				{
+					name: 'Images attached to posts',
+					value: 'attached'
+				},
+				{
+					name: 'Images scraped from post body content',
+					value: 'scraped'
+				},
+				{
+					name: 'All images',
+					value: 'all'
+				},
+				{
+					name: 'No',
+					value: 'none'
+				}
+			],
+			prompt: inquirer.select
+		},
+		{
+			name: 'wizard',
+			type: 'boolean',
+			description: 'Use wizard',
+			default: true
+		},
+		{
+			name: 'output',
+			type: 'folder-path',
+			description: 'Path to output folder',
+			default: 'output'
+		},
+		{
+			name: 'frontmatter-fields',
+			type: 'list',
+			description: 'Frontmatter fields',
+			default: 'title,date,categories,tags,coverImage,draft'
+		},
+		{
+			name: 'request-delay',
+			type: 'integer',
+			description: 'Delay between image file requests',
+			default: 500
+		},
+		{
+			name: 'write-delay',
+			type: 'integer',
+			description: 'Delay between writing markdown files',
+			default: 25
+		},
+		{
+			name: 'timezone',
+			type: 'string',
+			description: 'Timezone to apply to date',
+			default: 'utc'
+		},
+		{
+			name: 'include-time',
+			type: 'boolean',
+			description: 'Include time with frontmatter date',
+			default: false
+		},
+		{
+			name: 'date-format',
+			type: 'string',
+			description: 'Frontmatter date format string',
+			default: ''
+		},
+		{
+			name: 'quote-date',
+			type: 'boolean',
+			description: 'Wrap frontmatter date in quotes',
+			default: false
+		},
+		{
+			name: 'strict-ssl',
+			type: 'boolean',
+			description: 'Use strict SSL',
+			default: true
+		}
+	];
+}
@@ -1,40 +0,0 @@
-// Which fields to include in frontmatter. Look in /src/frontmatter to see available fields.
-// Order is preserved. If a field has an empty value, it will not be included. You can rename a
-// field by providing an alias after a ':'. For example, 'date:created' will include 'date' in
-// frontmatter, but renamed to 'created'.
-exports.frontmatter_fields = [
-	'title',
-	'date',
-	'categories',
-	'tags',
-	'coverImage'
-];
-
-// Time in ms to wait between requesting image files. Increase this if you see timeouts or
-// server errors.
-exports.image_file_request_delay = 500;
-
-// Time in ms to wait between saving Markdown files. Increase this if your file system becomes
-// overloaded.
-exports.markdown_file_write_delay = 25;
-
-// Enable this to include time with post dates. For example, "2020-12-25" would become
-// "2020-12-25T11:20:35.000Z".
-exports.include_time_with_date = false;
-
-// Override post date formatting with a custom formatting string (for example: 'yyyy LLL dd').
-// Tokens are documented here: https://moment.github.io/luxon/#/parsing?id=table-of-tokens. If
-// set, this takes precedence over include_time_with_date.
-exports.custom_date_formatting = '';
-
-// Specify the timezone used for post dates. See available zone values and examples here:
-// https://moment.github.io/luxon/#/zones?id=specifying-a-zone.
-exports.custom_date_timezone = 'utc';
-
-// Categories to be excluded from post frontmatter. This does not filter out posts themselves,
-// just the categories listed in their frontmatter.
-exports.filter_categories = ['uncategorized'];
-
-// Strict SSL is enabled as the safe default when downloading images, but will not work with
-// self-signed servers. You can disable it if you're getting a "self-signed certificate" error.
-exports.strict_ssl = true;
@@ -1,4 +1,77 @@
-function getFilenameFromUrl(url) {
+import chalk from 'chalk';
+import path from 'path';
+
+// simple data store, populated via intake, used everywhere
+export const config = {};
+
+export function camelCase(str) {
+	return str.replace(/-(.)/g, (match) => match[1].toUpperCase());
+}
+
+export function getSlugWithFallback(post) {
+	return post.slug ? post.slug : 'id-' + post.id;
+}
+
+export function logHeading(text) {
+	console.log(`\n${chalk.cyan(text + '...')}`);
+}
+
+export function buildPostPath(post, overrideConfig) {
+	const pathConfig = overrideConfig ?? config;
+
+	// start with output folder
+	const pathSegments = [pathConfig.output];
+
+	// add folder for post type if it exists
+	if (post.type) {
+		switch (post.type) {
+			case 'post':
+				pathSegments.push('posts');
+				break;
+			case 'page':
+				pathSegments.push('pages');
+				break;
+			default:
+				pathSegments.push('custom');
+				pathSegments.push(post.type);
+		}
+	}
+
+	// add drafts folder if this is a draft post
+	if (post.isDraft) {
+		pathSegments.push('_drafts');
+	}
+
+	// add folders for date year/month as appropriate
+	if (post.date) {
+		if (pathConfig.dateFolders === 'year' || pathConfig.dateFolders === 'year-month') {
+			pathSegments.push(post.date.toFormat('yyyy'));
+		}
+
+		if (pathConfig.dateFolders === 'year-month') {
+			pathSegments.push(post.date.toFormat('LL'));
+		}
+	}
+
+	// get slug with fallback
+	let slug = getSlugWithFallback(post);
+
+	// prepend date to slug as appropriate
+	if (pathConfig.prefixDate && post.date) {
+		slug = post.date.toFormat('yyyy-LL-dd') + '-' + slug;
+	}
+
+	// use slug as folder or filename as specified
+	if (pathConfig.postFolders) {
+		pathSegments.push(slug, 'index.md');
+	} else {
+		pathSegments.push(slug + '.md');
+	}
+
+	return path.join(...pathSegments);
+}
+
+export function getFilenameFromUrl(url) {
 	let filename = url.split('/').slice(-1)[0];
 	try {
 		filename = decodeURIComponent(filename)
@@ -8,5 +81,3 @@ function getFilenameFromUrl(url) {
 	}
 	return filename;
 }
-
-exports.getFilenameFromUrl = getFilenameFromUrl;
@@ -1,5 +1,9 @@
-const turndown = require('turndown');
-const turndownPluginGfm = require('turndown-plugin-gfm');
+import turndownPluginGfm from '@guyplusplus/turndown-plugin-gfm';
+import turndown from 'turndown';
+import * as shared from './shared.js';
+
+// init single reusable turndown service object upon import
+const turndownService = initTurndownService();

 function initTurndownService() {
 	const turndownService = new turndown({
@@ -10,15 +14,17 @@ function initTurndownService() {

 	turndownService.use(turndownPluginGfm.tables);

+	turndownService.remove(['style']); // <style> contents would otherwise be dumped as plain text, so remove them
+
 	// preserve embedded tweets
 	turndownService.addRule('tweet', {
-		filter: node => node.nodeName === 'BLOCKQUOTE' && node.getAttribute('class') === 'twitter-tweet',
+		filter: (node) => node.nodeName === 'BLOCKQUOTE' && node.getAttribute('class') === 'twitter-tweet',
 		replacement: (content, node) => '\n\n' + node.outerHTML
 	});

 	// preserve embedded codepens
 	turndownService.addRule('codepen', {
-		filter: node => {
+		filter: (node) => {
 			// codepen embed snippets have changed over the years
 			// but this series of checks should find the commonalities
 			return (
@@ -30,6 +36,14 @@ function initTurndownService() {
 		replacement: (content, node) => '\n\n' + node.outerHTML
 	});

+	// <div> within <a> can cause extra whitespace that wrecks markdown links, so this removes them
+	turndownService.addRule('a', {
+		filter: 'a',
+		replacement: (content) => {
+			return content.replace(/<\/?div[^>]*>/gi, '');
+		}
+	});
+
 	// preserve embedded scripts (for tweets, codepens, gists, etc.)
 	turndownService.addRule('script', {
 		filter: 'script',
@@ -73,7 +87,7 @@ function initTurndownService() {
 	// preserve <figcaption>
 	turndownService.addRule('figcaption', {
 		filter: 'figcaption',
-		replacement: (content, node) => {
+		replacement: (content) => {
 			// extra newlines are necessary for markdown and HTML to render correctly together
 			return '\n\n<figcaption>\n\n' + content + '\n\n</figcaption>\n\n';
 		}
@@ -81,12 +95,12 @@ function initTurndownService() {

 	// convert <pre> into a code block with language when appropriate
 	turndownService.addRule('pre', {
-		filter: node => {
+		filter: (node) => {
 			// a <pre> with <code> inside will already render nicely, so don't interfere
 			return node.nodeName === 'PRE' && !node.querySelector('code');
 		},
 		replacement: (content, node) => {
-			const language = node.getAttribute('data-wetm-language') || '';
+			const language = node.getAttribute('data-wetm-language') ?? '';
 			return '\n\n```' + language + '\n' + node.textContent + '\n```\n\n';
 		}
 	});
@@ -94,18 +108,16 @@ function initTurndownService() {
 	return turndownService;
 }

-function getPostContent(postData, turndownService, config) {
-	let content = postData.encoded[0];
-
+export function getPostContent(content) {
 	// insert an empty div element between double line breaks
 	// this nifty trick causes turndown to keep adjacent paragraphs separated
 	// without mucking up content inside of other elements (like <code> blocks)
 	content = content.replace(/(\r?\n){2}/g, '\n<div></div>\n');

-	if (config.saveScrapedImages) {
+	if (shared.config.saveImages === 'scraped' || shared.config.saveImages === 'all') {
 		// writeImageFile() will save all content images to a relative /images
 		// folder so update references in post content to match
-		content = content.replace(/(<img[^>]*src=").*?([^/"]+\.(?:gif|jpe?g|png|webp))("[^>]*>)/gi, '$1images/$2$3');
+		content = content.replace(/(<img(?=\s)[^>]+?(?<=\s)src=")[^"]*?([^/"]+)("[^>]*>)/gi, '$1images/$2$3');
 	}

 	// preserve "more" separator, max one per post, optionally with custom label
@@ -122,8 +134,8 @@ function getPostContent(postData, turndownService, config) {
 	// clean up extra spaces in list items
 	content = content.replace(/(-|\d+\.) +/g, '$1 ');

+	// collapse excessive newlines (can happen with a lot of <div>)
+	content = content.replace(/(\r?\n){3,}/g, '\n\n');
+
 	return content;
 }
-
-exports.initTurndownService = initTurndownService;
-exports.getPostContent = getPostContent;
@@ -1,201 +0,0 @@
-const camelcase = require('camelcase');
-const commander = require('commander');
-const fs = require('fs');
-const inquirer = require('inquirer');
-const path = require('path');
-
-const package = require('../package.json');
-
-// all user options for command line and wizard are declared here
-const options = [
-	// wizard must always be first
-	{
-		name: 'wizard',
-		type: 'boolean',
-		description: 'Use wizard',
-		default: true
-	},
-	{
-		name: 'input',
-		type: 'file',
-		description: 'Path to WordPress export file',
-		default: 'export.xml'
-	},
-	{
-		name: 'output',
-		type: 'folder',
-		description: 'Path to output folder',
-		default: 'output'
-	},
-	{
-		name: 'year-folders',
-		aliases: ['yearfolders', 'yearmonthfolders'],
-		type: 'boolean',
-		description: 'Create year folders',
-		default: false
-	},
-	{
-		name: 'month-folders',
-		aliases: ['yearmonthfolders'],
-		type: 'boolean',
-		description: 'Create month folders',
-		default: false
-	},
-	{
-		name: 'post-folders',
-		aliases: ['postfolders'],
-		type: 'boolean',
-		description: 'Create a folder for each post',
-		default: true
-	},
-	{
-		name: 'prefix-date',
-		aliases: ['prefixdate'],
-		type: 'boolean',
-		description: 'Prefix post folders/files with date',
-		default: false
-	},
-	{
-		name: 'save-attached-images',
-		aliases: ['saveimages'],
-		type: 'boolean',
-		description: 'Save images attached to posts',
-		default: true
-	},
-	{
-		name: 'save-scraped-images',
-		aliases: ['addcontentimages'],
-		type: 'boolean',
-		description: 'Save images scraped from post body content',
-		default: true
-	},
-	{
-		name: 'include-other-types',
-		type: 'boolean',
-		description: 'Include custom post types and pages',
-		default: false
-	}
-];
-
-async function getConfig(argv) {
-	extendOptionsData();
-	const unaliasedArgv = replaceAliases(argv);
-	const opts = parseCommandLine(unaliasedArgv);
-
-	let answers;
-	if (opts.wizard) {
-		console.log('\nStarting wizard...');
-		const questions = options.map(option => ({
-			when: option.name !== 'wizard' && !option.isProvided,
-			name: camelcase(option.name),
-			type: option.prompt,
-			message: option.description + '?',
-			default: option.default,
-	
-			// these are not used for all option types and that's fine
-			filter: option.coerce,
-			validate: option.validate
-		}));
-		answers = await inquirer.prompt(questions);
-	} else {
-		console.log('\nSkipping wizard...');
-		answers = {};
-	}
-
-	const config = { ...opts, ...answers };
-	return config;
-}
-
-function extendOptionsData() {
-	// add more data to each option based on its type
-	const map = {
-		boolean: {
-			prompt: 'confirm',
-			coerce: coerceBoolean,
-		},
-		file: {
-			prompt: 'input',
-			coerce: coercePath,
-			validate: validateFile
-		},
-		folder: {
-			prompt: 'input',
-			coerce: coercePath
-		}
-	};
-
-	options.forEach(option => {
-		Object.assign(option, map[option.type]);
-	});
-}
-
-function replaceAliases(argv) {
-	let paths = argv.slice(0, 2);
-	let replaced = [];
-	let unmodified = [];
-
-	argv.slice(2).forEach(arg => {
-		let aliasFound = false;
-
-		// this loop does not short circuit because an alias can map to multiple options
-		options.forEach(option => {
-			const aliases = option.aliases || [];
-			aliases.forEach(alias => {
-				if (arg.includes('--' + alias)) {
-					replaced.push(arg.replace('--' + alias, '--' + option.name));
-					aliasFound = true;
-				}	
-			});
-		});
-
-		if (!aliasFound) {
-			unmodified.push(arg);
-		}
-	});
-
-	return [...paths, ...replaced, ...unmodified];
-}
-
-function parseCommandLine(argv) {
-	// setup for help output
-	commander.program
-		.name('node index.js')
-		.version('v' + package.version, '-v, --version', 'Display version number')
-		.helpOption('-h, --help', 'See the thing you\'re looking at right now')
-		.addHelpText('after', '\nMore documentation is at https://github.com/lonekorean/wordpress-export-to-markdown');
-
-	options.forEach(input => {
-		const flag = '--' + input.name + ' <' + input.type + '>';
-		const coerce = (value) => {
-			// commander only calls coerce when an input is provided on the command line, which
-			// makes for an easy way to flag (for later) if it should be excluded from the wizard
-			input.isProvided = true;
-			return input.coerce(value);
-		};
-		commander.program.option(flag, input.description, coerce, input.default);
-	});
-
-	commander.program.parse(argv);
-	return commander.program.opts();
-}
-
-function coerceBoolean(value) {
-	return !['false', 'no', '0'].includes(value.toLowerCase());
-}
-
-function coercePath(value) {
-	return path.normalize(value);
-}
-
-function validateFile(value) {
-	let isValid;
-	try {
-		isValid = fs.existsSync(value) && fs.statSync(value).isFile();
-	} catch (ex) {
-		isValid = false;
-	}
-
-	return isValid ? true : 'Unable to find file: ' + path.resolve(value);
-}
-
-exports.getConfig = getConfig;
@@ -1,36 +1,34 @@
-const axios = require('axios');
-const chalk = require('chalk');
-const fs = require('fs');
-const http = require('http');
-const https = require('https');
-const luxon = require('luxon');
-const path = require('path');
+import axios from 'axios';
+import chalk from 'chalk';
+import fs from 'fs';
+import http from 'http';
+import https from 'https';
+import * as luxon from 'luxon';
+import path from 'path';
+import * as shared from './shared.js';

-const shared = require('./shared');
-const settings = require('./settings');
-
-async function writeFilesPromise(posts, config) {
-	await writeMarkdownFilesPromise(posts, config);
-	await writeImageFilesPromise(posts, config);
+export async function writeFilesPromise(posts) {
+	await writeMarkdownFilesPromise(posts);
+	await writeImageFilesPromise(posts);
 }

 async function processPayloadsPromise(payloads, loadFunc) {
-	const promises = payloads.map(payload => new Promise((resolve, reject) => {
+	const promises = payloads.map((payload) => new Promise((resolve, reject) => {
 		setTimeout(async () => {
 			try {
 				const data = await loadFunc(payload.item);
 				await writeFile(payload.destinationPath, data);
-				console.log(chalk.green('[OK]') + ' ' + payload.name);
+				logPayloadResult(payload);
 				resolve();
 			} catch (ex) {
-				console.log(chalk.red('[FAILED]') + ' ' + payload.name + ' ' + chalk.red('(' + ex.toString() + ')'));
+				logPayloadResult(payload, ex.message);
 				reject();
 			}
 		}, payload.delay);
 	}));

 	const results = await Promise.allSettled(promises);
-	const failedCount = results.filter(result => result.status === 'rejected').length;
+	const failedCount = results.filter((result) => result.status === 'rejected').length;
 	if (failedCount === 0) {
 		console.log('Done, got them all!');
 	} else {
@@ -43,33 +41,31 @@ async function writeFile(destinationPath, data) {
 	await fs.promises.writeFile(destinationPath, data);
 }

-async function writeMarkdownFilesPromise(posts, config) {
+async function writeMarkdownFilesPromise(posts) {
 	// package up posts into payloads
-	let skipCount = 0;
+	let existingCount = 0;
 	let delay = 0;
-	const payloads = posts.flatMap(post => {
-		const destinationPath = getPostPath(post, config);
+	const payloads = posts.flatMap((post) => {
+		const destinationPath = shared.buildPostPath(post);
 		if (checkFile(destinationPath)) {
 			// already exists, don't need to save again
-			skipCount++;
+			existingCount++;
 			return [];
 		} else {
 			const payload = {
 				item: post,
-				name: (config.includeOtherTypes ? post.meta.type + ' - ' : '') + post.meta.slug,
+				type: post.type,
+				name: shared.getSlugWithFallback(post),
 				destinationPath,
 				delay
 			};
-			delay += settings.markdown_file_write_delay;
+			delay += shared.config.writeDelay;
 			return [payload];
 		}
 	});

-	const remainingCount = payloads.length;
-	if (remainingCount + skipCount === 0) {
-		console.log('\nNo posts to save...');
-	} else {
-		console.log(`\nSaving ${remainingCount} posts (${skipCount} already exist)...`);
+	logSavingMessage('posts', existingCount, payloads.length);
+	if (payloads.length > 0) {
 		await processPayloadsPromise(payloads, loadMarkdownFilePromise);
 	}
 }
@@ -84,9 +80,25 @@ async function loadMarkdownFilePromise(post) {
 				// array of one or more strings
 				outputValue = value.reduce((list, item) => `${list}\n  - "${item}"`, '');
 			}
+		} else if (Number.isInteger(value)) {
+			// output unquoted
+			outputValue = value.toString();
+		} else if (value instanceof luxon.DateTime) {
+			if (shared.config.dateFormat) {
+				outputValue = value.toFormat(shared.config.dateFormat);
+			} else {
+				outputValue = shared.config.includeTime ? value.toISO() : value.toISODate();
+			}
+
+			if (shared.config.quoteDate) {
+				outputValue = `"${outputValue}"`;
+			}
+		} else if (typeof value === 'boolean') {
+			// output unquoted
+			outputValue = value.toString();
 		} else {
 			// single string value
-			const escapedValue = (value || '').replace(/"/g, '\\"');
+			const escapedValue = (value ?? '').replace(/"/g, '\\"');
 			if (escapedValue.length > 0) {
 				outputValue = `"${escapedValue}"`;
 			}
@@ -101,38 +113,36 @@ async function loadMarkdownFilePromise(post) {
 	return output;
 }

-async function writeImageFilesPromise(posts, config) {
+async function writeImageFilesPromise(posts) {
 	// collect image data from all posts into a single flattened array of payloads
-	let skipCount = 0;
+	let existingCount = 0;
 	let delay = 0;
-	const payloads = posts.flatMap(post => {
-		const postPath = getPostPath(post, config);
+	const payloads = posts.flatMap((post) => {
+		const postPath = shared.buildPostPath(post);
 		const imagesDir = path.join(path.dirname(postPath), 'images');
-		return post.meta.imageUrls.flatMap(imageUrl => {
+		return post.imageUrls.flatMap((imageUrl) => {
 			const filename = shared.getFilenameFromUrl(imageUrl);
 			const destinationPath = path.join(imagesDir, filename);
 			if (checkFile(destinationPath)) {
 				// already exists, don't need to save again
-				skipCount++;
+				existingCount++;
 				return [];
 			} else {
 				const payload = {
 					item: imageUrl,
+					type: 'image',
 					name: filename,
 					destinationPath,
 					delay
 				};
-				delay += settings.image_file_request_delay;
+				delay += shared.config.requestDelay;
 				return [payload];
 			}
 		});
 	});

-	const remainingCount = payloads.length;
-	if (remainingCount + skipCount === 0) {
-		console.log('\nNo images to download and save...');
-	} else {
-		console.log(`\nDownloading and saving ${remainingCount} images (${skipCount} already exist)...`);
+	logSavingMessage('images', existingCount, payloads.length);
+	if (payloads.length > 0) {
 		await processPayloadsPromise(payloads, loadImageFilePromise);
 	}
 }
@@ -141,7 +151,7 @@ async function loadImageFilePromise(imageUrl) {
 	// only encode the URL if it doesn't already have encoded characters
 	const url = (/%[\da-f]{2}/i).test(imageUrl) ? imageUrl : encodeURI(imageUrl);

-	const config = {
+	const requestConfig = {
 		method: 'get',
 		url,
 		headers: {
@@ -150,70 +160,44 @@ async function loadImageFilePromise(imageUrl) {
 		responseType: 'arraybuffer'
 	};

-	if (!settings.strict_ssl) {
+	if (!shared.config.strictSsl) {
 		// custom agents to disable SSL errors (adding both http and https, just in case)
-		config.httpAgent = new http.Agent({ rejectUnauthorized: false });
-		config.httpsAgent = new https.Agent({ rejectUnauthorized: false });
+		requestConfig.httpAgent = new http.Agent({ rejectUnauthorized: false });
+		requestConfig.httpsAgent = new https.Agent({ rejectUnauthorized: false });
 	}

-	let buffer;
-	try {
-		const response = await axios(config);
-		buffer = Buffer.from(response.data, 'binary');
-	} catch (ex) {
-		if (ex.response) {
-			// request was made, but server responded with an error status code
-			throw 'StatusCodeError: ' + ex.response.status;
-		} else {
-			// something else went wrong, rethrow
-			throw ex;
-		}
-	}
+	const response = await axios(requestConfig);
+	const buffer = Buffer.from(response.data, 'binary');
+
 	return buffer;
 }

-function getPostPath(post, config) {
-	let dt;
-	if (settings.custom_date_formatting) {
-		dt = luxon.DateTime.fromFormat(post.frontmatter.date, settings.custom_date_formatting);
-	} else {
-		dt = luxon.DateTime.fromISO(post.frontmatter.date);
-	}
-
-	// start with base output dir
-	const pathSegments = [config.output];
-
-	// create segment for post type if we're dealing with more than just "post"
-	if (config.includeOtherTypes) {
-		pathSegments.push(post.meta.type);
-	}
-
-	if (config.yearFolders) {
-		pathSegments.push(dt.toFormat('yyyy'));
-	}
-
-	if (config.monthFolders) {
-		pathSegments.push(dt.toFormat('LL'));
-	}
-
-	// create slug fragment, possibly date prefixed
-	let slugFragment = post.meta.slug;
-	if (config.prefixDate) {
-		slugFragment = dt.toFormat('yyyy-LL-dd') + '-' + slugFragment;
-	}
-
-	// use slug fragment as folder or filename as specified
-	if (config.postFolders) {
-		pathSegments.push(slugFragment, 'index.md');
-	} else {
-		pathSegments.push(slugFragment + '.md');
-	}
-
-	return path.join(...pathSegments);
-}
-
 function checkFile(path) {
 	return fs.existsSync(path);
 }

-exports.writeFilesPromise = writeFilesPromise;
+function logSavingMessage(things, existingCount, remainingCount) {
+	shared.logHeading(`Saving ${things}`);
+	if (existingCount + remainingCount === 0) {
+		console.log(`No ${things} to save.`);
+	} else if (existingCount === 0) {
+		console.log(`${remainingCount} ${things} to save.`);
+	} else if (remainingCount === 0) {
+		console.log(`All ${existingCount} ${things} already saved.`);
+	} else {
+		console.log(`${existingCount} ${things} already saved, ${remainingCount} remaining.`);
+	}
+}
+
+function logPayloadResult(payload, errorMessage) {
+	const messageBits = [
+		errorMessage ? chalk.red('✗') : chalk.green('✓'),
+		chalk.gray(`[${payload.type}]`),
+		payload.name
+	];
+	if (errorMessage) {
+		messageBits.push(chalk.red(`(${errorMessage})`));
+	}
+
+	console.log(messageBits.join(' '));
+}
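
Note on the write pattern in this file: `processPayloadsPromise` gives each payload its own start delay, then waits on everything with `Promise.allSettled`, so one failed write or download does not abort the rest. Below is a minimal standalone sketch of that pattern, not the project's code. The names `runStaggered`, `buildPayloads`, the `saveFunc` parameter, and the `out/` path are hypothetical and exist only for illustration. Unlike the diff above, the sketch passes the error to `reject` so the rejection reason stays inspectable.

```javascript
// Sketch only: runStaggered, buildPayloads, and saveFunc are hypothetical names.

// Build payloads whose start delays step up by stepMs (0, stepMs, 2 * stepMs, ...).
function buildPayloads(items, stepMs) {
	let delay = 0;
	return items.map((item) => {
		const payload = {
			item,
			name: String(item),
			destinationPath: `out/${item}.md`, // assumed path shape, for illustration
			delay
		};
		delay += stepMs;
		return payload;
	});
}

// Start each payload after its own delay, then wait for all of them to settle.
async function runStaggered(payloads, loadFunc, saveFunc) {
	const promises = payloads.map((payload) => new Promise((resolve, reject) => {
		setTimeout(async () => {
			try {
				const data = await loadFunc(payload.item);
				await saveFunc(payload.destinationPath, data);
				resolve(payload.name);
			} catch (ex) {
				// pass the error along so result.reason is available to callers
				reject(ex);
			}
		}, payload.delay);
	}));

	// allSettled never rejects; each result reports 'fulfilled' or 'rejected'
	const results = await Promise.allSettled(promises);
	const failedCount = results.filter((result) => result.status === 'rejected').length;
	return { total: results.length, failedCount };
}
```

A caller would pass its own loader and writer, for example `await runStaggered(buildPayloads(slugs, 25), loadPost, saveFile)`, where `slugs`, `loadPost`, and `saveFile` are placeholders, and then report `failedCount` from the returned summary.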