# wordpress-export-to-markdown

A script that converts a WordPress export XML file into Markdown files. Useful if you want to migrate from WordPress to a static site generator ([Gatsby](https://www.gatsbyjs.org/), [Hugo](https://gohugo.io/), [Jekyll](https://jekyllrb.com/), etc.).

Each post is saved as a separate Markdown file with appropriate frontmatter. Images are also downloaded and saved. Embedded content from YouTube, Twitter, CodePen, etc. is carefully preserved.

## Quick Start

You'll need:

- [Node.js](https://nodejs.org/) v12.14 or later
- Your [WordPress export file](https://codex.wordpress.org/Tools_Export_Screen)

You can run this script immediately in your terminal with `npx`:

```
npx wordpress-export-to-markdown
```

Or you can clone and run (this makes repeated runs faster and allows you to tinker with the code). After cloning this repo, open your terminal to the package's directory and run:

```
npm install
node index.js
```

Either way you run it, the script will start the wizard. Answer the prompts and off you go!

## Command Line

The wizard makes it easy to configure your options, but you can also do so via the command line if you want.

For example, the following will give you [Jekyll](https://jekyllrb.com/)-style output in terms of folder structure and filenames.

Using `npx`:

```
npx wordpress-export-to-markdown --post-folders=false --prefix-date=true
```

Using a locally cloned repo:

```
node index.js --post-folders=false --prefix-date=true
```

The wizard will still prompt you for any options not specified on the command line. To skip the wizard entirely and use default values for unspecified options, add `--wizard=false`.
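
To make the precedence concrete, here is a minimal sketch of how command line values, defaults, and wizard prompts could fit together. It is an illustration only, not the script's actual code: `DEFAULTS`, `parseArgs`, and `resolveOptions` are hypothetical names, and the default values are copied from the Options section of this README.

```javascript
// Defaults as documented in the Options section of this README.
const DEFAULTS = {
  wizard: true,
  input: 'export.xml',
  output: 'output',
  'year-folders': false,
  'month-folders': false,
  'post-folders': true,
  'prefix-date': false,
  'save-attached-images': true,
  'save-scraped-images': true,
};

// Hypothetical helper: collect `--name=value` arguments into an object.
function parseArgs(argv) {
  const given = {};
  for (const arg of argv) {
    const match = /^--([a-z-]+)=(.*)$/.exec(arg);
    if (!match) continue; // skip anything that is not --name=value
    const [, name, raw] = match;
    if (!Object.prototype.hasOwnProperty.call(DEFAULTS, name)) continue; // skip unknown options
    // Booleans arrive as the strings "true"/"false"; paths stay strings.
    given[name] = typeof DEFAULTS[name] === 'boolean' ? raw === 'true' : raw;
  }
  return given;
}

// Hypothetical helper: settle option values and list what the wizard would still ask.
function resolveOptions(argv) {
  const given = parseArgs(argv);
  const options = { ...DEFAULTS, ...given }; // command line values win over defaults
  // Assumption: with the wizard enabled, every option not given gets a prompt.
  const toPrompt = options.wizard
    ? Object.keys(DEFAULTS).filter((name) => name !== 'wizard' && !(name in given))
    : [];
  return { options, toPrompt };
}

const jekyll = resolveOptions(['--post-folders=false', '--prefix-date=true']);
console.log(jekyll.toPrompt); // options the wizard would still prompt for

const unattended = resolveOptions(['--wizard=false']);
console.log(unattended.toPrompt); // → []
```

In this sketch, passing `--wizard=false` leaves nothing to prompt for, so every unspecified option simply keeps its default.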

## Options

### Use wizard?

- Argument: `--wizard`
- Type: `boolean`
- Default: `true`

Enable to have the script prompt you for each option. Disable to skip the wizard entirely and use default values for any options not specified via the command line.

### Path to input file?

- Argument: `--input`
- Type: `file` (as a path string)
- Default: `export.xml`

The path for the file to parse. This should be the WordPress export XML file that you downloaded. The easiest thing to do is drop your `export.xml` file into the script's directory and use the default value for this option.

### Path to output folder?

- Argument: `--output`
- Type: `folder` (as a path string)
- Default: `output`

The path for the output directory where Markdown and image files will be saved. If it does not exist, it will be created for you.

### Create year folders?

- Argument: `--year-folders`
- Type: `boolean`
- Default: `false`

Whether or not to organize output files into folders by year.

### Create month folders?

- Argument: `--month-folders`
- Type: `boolean`
- Default: `false`

Whether or not to organize output files into folders by month. You'll probably want to combine this with `--year-folders` to organize files by year then month.

### Create a folder for each post?

- Argument: `--post-folders`
- Type: `boolean`
- Default: `true`

Whether or not to save files and images into post folders.

If `true`, the post slug is used for the folder name and the post's Markdown file is named `index.md`. Each post folder will have its own `/images` folder.

    /first-post
        /images
            potato.png
        index.md
    /second-post
        /images
            carrot.jpg
            celery.jpg
        index.md

If `false`, the post slug is used to name the post's Markdown file. These files will be side-by-side and images will go into a shared `/images` folder.

    /images
        carrot.jpg
        celery.jpg
        potato.png
    first-post.md
    second-post.md

Either way, this can be combined with `--year-folders` and `--month-folders`, in which case the above output will be organized under the appropriate year and month folders.

### Prefix post folders/files with date?

- Argument: `--prefix-date`
- Type: `boolean`
- Default: `false`

Whether or not to prepend the post date to the post slug when naming a post's folder or file.

If `--post-folders` is `true`, this affects the folder.

    /2019-10-14-first-post
        index.md
    /2019-10-23-second-post
        index.md

If `--post-folders` is `false`, this affects the file.

    2019-10-14-first-post.md
    2019-10-23-second-post.md
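
The four folder and naming options combine into a single output path per post. The sketch below shows one way that combination could work. It is not taken from the script's source: `buildPostPath` is a hypothetical helper, the `post` shape (`slug` plus a `YYYY-MM-DD` `date` string) is assumed, and naming year and month folders `2019` and `10` is also an assumption.

```javascript
// Hypothetical helper: relative path of a post's Markdown file.
// `post` is assumed to look like { slug: 'first-post', date: '2019-10-14' }.
function buildPostPath(post, options) {
  const [year, month] = post.date.split('-');
  const segments = [];

  // Assumption: year and month folders are named after the date parts.
  if (options.yearFolders) segments.push(year);
  if (options.monthFolders) segments.push(month);

  // --prefix-date prepends the post date to the slug.
  const name = options.prefixDate ? `${post.date}-${post.slug}` : post.slug;

  if (options.postFolders) {
    // Folder per post: the slug names the folder, the file is index.md.
    segments.push(name, 'index.md');
  } else {
    // Side-by-side files: the slug names the file itself.
    segments.push(`${name}.md`);
  }
  return segments.join('/');
}

const post = { slug: 'first-post', date: '2019-10-14' };

console.log(buildPostPath(post, { postFolders: true, prefixDate: true }));
// → 2019-10-14-first-post/index.md

console.log(buildPostPath(post, { postFolders: false, prefixDate: true }));
// → 2019-10-14-first-post.md

console.log(buildPostPath(post, { yearFolders: true, monthFolders: true, postFolders: true }));
// → 2019/10/first-post/index.md
```

The second call mirrors the Jekyll-style example from the Command Line section (`--post-folders=false --prefix-date=true`).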

### Save images attached to posts?

- Argument: `--save-attached-images`
- Type: `boolean`
- Default: `true`

Whether or not to download and save images attached to posts. Generally speaking, these are images that were added by dragging/dropping or clicking **Add Media** or **Set Featured Image** when editing a post in WordPress. Images are saved into `/images`.

### Save images scraped from post body content?

- Argument: `--save-scraped-images`
- Type: `boolean`
- Default: `true`

Whether or not to download and save images scraped from `<img>` tags in post body content. Images are saved into `/images`. The `<img>` tags are updated to point to where the images are saved.
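
As a rough illustration of what "scraped and updated" means, the sketch below finds `<img>` tags in a post body, records which URLs would need downloading, and rewrites each `src` to a local path. It makes several assumptions: `rewriteImageSources` is a hypothetical helper, the regular expression only handles double-quoted `src` attributes, and the local path format `images/<filename>` is a guess. The script itself may handle this differently.

```javascript
// Hypothetical helper: point each <img> src at a local copy under images/.
// Returns the rewritten HTML plus the list of images that would be downloaded.
function rewriteImageSources(html, imagesDir = 'images') {
  const downloads = [];
  const rewritten = html.replace(
    // Capture everything up to the src value, the URL, and the closing quote.
    /(<img\b[^>]*?\ssrc=")([^"]+)(")/gi,
    (whole, before, url, after) => {
      // Drop any query string, then keep the last path segment as the filename.
      const filename = url.split('?')[0].split('/').pop();
      downloads.push({ url, filename });
      return `${before}${imagesDir}/${filename}${after}`;
    }
  );
  return { html: rewritten, downloads };
}

const body =
  '<p><img src="https://example.com/wp-content/uploads/2019/10/potato.png" alt="Potato"></p>';
const result = rewriteImageSources(body);

console.log(result.html);
// → <p><img src="images/potato.png" alt="Potato"></p>
console.log(result.downloads[0].filename);
// → potato.png
```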