# clrmame
Little tool to scan and rebuild MAME (https://www.mamedev.org/) machine sets.

(c) 2022 - 2026 Roman Scherzer

This software is freeware.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

For (c) and licence information of the used 3rd party tools, please refer
to the bottom of this document.


## The story so far

Another clrmame?

Well, actually you might know that I started clrmame(pro) in 1997. Back then
the MAME world was different, the C++ world was different. clrmamepro grew over
decades while MAME added things like using crc32, md5, sha1, merge modes, chds,
devices, xml and so on. And then there were other projects which also wanted to
use clrmamepro for their purposes and so lots of requests were added.

The downside of this: Maintaining the code is pretty tough. I'm pretty bored
looking at that old code base, so it is time for something new. It feels a bit
like 1997 again, code something for me, for fun. It's fun again to see that
'modern' C++ and 3rd party tools make life way easier. Less options, faster, no
MS Windows specific coding, complete new ideas how to keep data in memory or
how to handle different merge modes. Maybe some people will enjoy it as much as
I do.

For now, the scanner and rebuilder focus is on the core, not the UI. It's
fine for using them on let's say one or two setups which more or less cover the
use cases of MAME, i.e. MAME's -listxml and MAME's -listsoftware machine
collections. They can be handled rather conveniently with the UI versions.

A real profiler which gives you a better overview of several setups is planned.


So, if you're not bored yet, read on.


## Small disclaimer:

You as the user are responsible for what you do, especially when it comes to
removal of files or enabled fix options. If you use a wrong setup and e.g.
scan folders on your system which have nothing to do with the files from the
loaded xml holding the machine definitions, they are marked as unneeded and
get removed.

You use this software at your own risk. If you don't use the program
in the right way, there is a chance of data loss. I highly advise you to double
check your settings before you do a fix operation or work with removal
operations and of course: no backup - no mercy. I'm not responsible for any
kind of data loss which may be caused by using this software.


## What do you get?

There is no install wizard. You simply unpack the archive holding several
files, including this readme. It's recommended to use a dedicated folder to store
the unpacked files. The folder should NOT be UAC protected (like e.g.
C:\program files). Use e.g. D:\clrmame or something similar. The reason is
that the programs create some folders or the settings.xml file where the exe
files are placed. In a UAC protected folder you'd need administration rights
for this.

The package currently contains two exe files. One for the UI version and one
for the commandline version of clrmame (currently a Rebuilder, Scanner
and Dir2Dat collection). Each exe file can run on its own, the only shared
files are the settings and the 7z.dll which is required for 7z archive support.

The tools are built for users who are familiar with MAME. You might read/hear
terms which are commonly used in the MAME environment when it comes to storing
machine sets. You should also be aware that playing/using machine sets in MAME
is something else than auditing them. A machine which was scanned without
errors doesn't necessarily work in MAME, e.g. due to missing or incomplete
emulation. Also keep in mind that machines in MAME often need dependency
machines (parent, bios and/or device machines).


## Quick Start

I know clrmamepro for years or I don't care about further reading, so how can I
quickly dive into the tools?


If you want a user interface, you run clrmameUI.exe. Currently you can select
either a scan or a rebuilder module via a tab control at the top of the window.

You need to fill in some mandatory information:

**Scanner:**

- XML / EXE - you need to enter a full filename of the datfile you want to use or
point to a MAME executable plus list its export function.
e.g. "F:\MAMEEmu\mame.exe -listxml" or "c:\temp\266.xml"
- Rom Paths - you need to enter your rom path. If you use multiple ones, separate
them by ;
- Sample Paths - in case you also have samples, enter your sample path(s) here.
- Backup Path - you need to specify a backup folder.
- Mode - specify your used merge mode.

~~~ There is one major difference if you are used to full merged sets.
In full mode, machines are now organized in subfolders, i.e. the parent machine
is in the archive root, each clone gets its own subfolder within the archive.
This storing mechanism was also possible in clrmamepro (Settings->Full Merge
Mode-> Hash Collision Name) but it wasn't the default. When you scan your old
clrmamepro audited full merged set you will see a lot of wrongly named files
within the report. Such warnings are all complaints about clone files within the
parent archive which are not in a subfolder.
~~~

- Hit the New Scan button and wait till the result is shown.


**Rebuilder:**
Select the rebuilder tab in clrmameUI.exe and fill in / select the following:

- XML / EXE - you need to enter a full filename of the datfile you want to use or
point to a MAME executable plus list its export function.
e.g. "F:\MAMEEmu\mame.exe -listxml" or "c:\temp\266.xml"
- Input Paths - you need to enter your input path. If you use multiple ones,
separate them by ;
- Output Path - you need to enter your output path.
- Backup Path - you need to specify a backup folder.
- Mode - specify your preferred merge mode for the output files
- Compress - specify your preferred compression type for the output files

- Hit the Rebuild button and wait till the result is shown.


## Scanner

The scanner runs through the specified rom and sample paths and checks the
files in there. You should follow one of the following storing methods for your
rom/sample machine sets:

`rompath\machineName\rom file 1...n` for decompressed sets

`rompath\machineName.zip` (or .7z) for compressed sets, where the archive holds
rom file 1..n

In case of pattern usage (see below) you can use:

`rompath\pattern\machineName\rom file 1...n` for decompressed sets

`rompath\pattern\machineName\machineName.zip` (or .7z) for compressed sets,
where the archive holds rom file 1..n

Disk files (.chd) are always stored decompressed and also follow the storing
method `rompath\machineName\disk file 1...n` or
`rompath\pattern\machineName\disk file 1...n`

Same applies to the sample folder. If you don't specify a sample folder,
samples won't be scanned.
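
As an illustration of the two layouts without a pattern, here is a minimal Python sketch. The helper `expected_locations` and the rom names are made up for this example and are not part of clrmame; the sketch only mirrors the storing methods described above.

```python
from pathlib import PureWindowsPath


def expected_locations(rompath, machine, rom_names, compressed=True):
    """Return the paths at which the files of one machine would be expected.

    Hypothetical helper: it only mirrors the two non-pattern layouts
    described above and is not taken from the clrmame sources.
    """
    root = PureWindowsPath(rompath)
    if compressed:
        # One archive per machine; the rom files live inside the archive.
        return [root / f"{machine}.zip"]
    # One folder per machine with the rom files stored as plain files.
    return [root / machine / name for name in rom_names]


# Made-up rom names, just to show the shape of the result.
print(expected_locations(r"f:\roms", "pacman", ["rom1.bin", "rom2.bin"], compressed=False))
print(expected_locations(r"f:\roms", "pacman", ["rom1.bin", "rom2.bin"]))
```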

The scanner can detect unneeded files, wrongly named files (case sensitive) and
missing files. Files can be rom files, sample files and chd files or the
archives themselves holding rom and sample files. Wrong sized files or files
with hash differences are handled as unneeded.

Unless you use the fix options, you can safely scan your folders. Nothing will
be touched in that case.

**Beware: Any file which does not represent a file specified in the loaded xml
file is handled as unneeded.** This also applies to files from
the xml which don't follow the storing method. However, the scanner is able to
see if such a file is just not in place and moves it there. Generally any unneeded file
(which is not just wrongly placed) will be put to the backup folder.

The scan operation is divided into several phases where it e.g. scans the
single files, collects information about preferred storing paths or updates the
missing information. Finally a fixing phase (when enabled) is done and in the
end it generates an output. Fixing is done in-place wherever possible. When
a new machine set needs to be generated or a wrongly placed one is moved, the
scanner tries to use the best place for it. This only plays a role when you use
more than one rompath or you use patterns.

The scanner creates an output report file in its scan folder which shows
which issues were detected. In case of an enabled fix option, fixed problems
are not listed, so the report will only hold remaining problems. A fixdat is
also created in the fixdat folder. A fixdat is an xml file (which can be used
inside the scanner/rebuilder or clrmamepro) which only holds missing items.
A typical use of such a file is to share it with other people so they can help
you find the missing files.
If you're using the UI version, the report file is visualized in a tree like
output.

What can you do about missing files in the report? Well, first of all you need
them somewhere. The scanner can't magically create missing files out of
nothing. However, it can find files for you. The scanner by default looks up
missing files inside your collection and it also looks in the specified backup
folder if you got the "Incl. as Add Path" option enabled (--ba). So if a file
was removed in the past and gets needed again in the future, it will be found.
Additionally you can specify 'addpaths' which then will be included in the
search for missing files. Files inside the backup and addpaths won't be
removed if a match was found and added to the rompath.

The scanner has two scan modes. The 'new scan' loads the xml file and scans all
machines in your paths. The second mode ('scan') tries to reuse scan
information and limits the scan to the machines which had issues previously.
The information can only be used if the settings and underlying xml data
are the same. So a typical approach when a new MAME is out is to do a full new
scan (the underlying xml data is new, so there is no choice anyhow). After that you may
have some missing files here and there. If you want to scan again, you can
use the second scan method and it will be way faster since it will only
rescan the machines which had problems before.
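
The choice between the two scan modes can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the function name, the structure of the saved data and the idea of comparing two keys for settings and xml data. Falling back to a new scan when the keys differ is a choice of this sketch; the text above only states that old information can't be reused in that case.

```python
def pick_scan_mode(force_new, previous, settings_key, xml_key):
    """Decide between a full 'new scan' and a limited 'scan'.

    previous is None or a dict with the keys 'settings_key', 'xml_key' and
    'problem_machines' saved by an earlier run (a hypothetical structure).
    Returns the mode and, for a limited scan, the machines to rescan.
    """
    if force_new or previous is None:
        return "new", None
    if previous["settings_key"] != settings_key or previous["xml_key"] != xml_key:
        # Settings or datfile changed: the old results cannot be reused.
        return "new", None
    return "scan", sorted(previous["problem_machines"])


previous = {"settings_key": "s1", "xml_key": "x1", "problem_machines": {"pacman", "outrun"}}
print(pick_scan_mode(False, previous, "s1", "x1"))  # limited scan of two machines
print(pick_scan_mode(False, previous, "s1", "x2"))  # new xml data, so a new scan
```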


## Rebuilder

The rebuilder analyzes each file (or file in an archive) in the input folder
and tries to match each file's hash/size against the loaded database. Each
found match (there can be multiple, think e.g. of a file which is shared by 10
machines) will be added to the output folder. There it uses the correct file
name and machine name. Optionally, rebuilt files (and empty folders) are
removed from the input folder. If your input folder holds an archive with an
archive or chd inside, this inner archive is unpacked to your temporary folder.
Such temporary files are removed at the end of a rebuilding process.
By default the system's temporary folder is used. You can alter the temporary
folder by modifying the "TempFolder" entry in settings.xml. This can
be very useful if your system's temporary folder is on a slower disk. Selecting
a custom temporary folder can have a positive effect on rebuilding speed.

By default, the rebuilder matches files by crc32/sha1 and size. There is an
option (-s, --sha1) which allows you to select between no sha1, input, output
and both sha1 checks. Enabling one sha1 mode is of course slower since it needs
additional time for decompressing the archived data and the actual hash
calculation. Surely it's more accurate. In case there are different sha1 values
for one crc32 specified in the datfile and you disabled the input sha1 check,
it will take the sha1 value into account, too. 
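
To show why the additional sha1 check is more accurate, here is a rough Python sketch of matching by size and crc32 with an optional sha1 comparison. The database structure and the function names are hypothetical, and the second database entry is constructed on purpose to share size and crc32 while having a different sha1. The real rebuilder reads hashes from archive headers where possible, which this sketch does not do.

```python
import hashlib
import zlib


def fingerprint(data):
    """Return (size, crc32, sha1) of a bytes object, hashes as lowercase hex."""
    return len(data), f"{zlib.crc32(data):08x}", hashlib.sha1(data).hexdigest()


def find_matches(data, database, check_sha1=False):
    """Return the (machine, rom name) entries whose size and hashes match data.

    database is a hypothetical list of dicts with the keys
    machine, name, size, crc and sha1.
    """
    size, crc, sha1 = fingerprint(data)
    hits = []
    for entry in database:
        if entry["size"] != size or entry["crc"] != crc:
            continue
        if check_sha1 and entry["sha1"] != sha1:
            continue
        hits.append((entry["machine"], entry["name"]))
    return hits


blob = b"example rom data"
size, crc, sha1 = fingerprint(blob)
database = [
    {"machine": "setA", "name": "a.bin", "size": size, "crc": crc, "sha1": sha1},
    # Artificial entry: same size and crc32, different sha1.
    {"machine": "setB", "name": "b.bin", "size": size, "crc": crc, "sha1": "0" * 40},
]
print(find_matches(blob, database))                   # both entries match
print(find_matches(blob, database, check_sha1=True))  # only setA remains
```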

This rebuilder is also able to rebuild CHD files (from version 3 onwards). For
such files, the sha1 information from the CHD header is used for a match.

If you already have an existing archive/folder in the destination, the matched
file is added but only if the file itself does not exist there. If there is an
existing file but it does not match the right hash, this existing file is moved
to the backup folder and the found match replaces it in the rebuilder
destination. The backup folder needs to be specified.

This rebuilder can also identify source archives which already match an output
machine completely. If the destination does not exist yet, such archives are
copied directly.

The rebuilder runs through various phases. First it loads the datfile, checks
source/destination paths for existence and builds merge mode views. Then it
runs through the input folder. Be patient, this can take some time,
especially if you scan lots of files. After that the output is checked. If no
output exists or it contains only a few files, this should be very fast. An optimize
phase is next and -if possible- the rebuilder then copies archives which can be
copied directly. Finally, the actual rebuild is done and in the end a cleaning
step is performed.

## Commandline versus UI

In the following the options are described based on their commandline
equivalent. If you're not a typical commandline user, don't worry, the options
correspond nearly 1:1 to checkboxes/comboboxes/inputfields which can be found
in the UI.


## Shared commandline options

In general (and this also includes dir2dat described later in this readme), the
commandline version also offers the usage of configuration files either in
typical ini or toml format.

**--config**
Here you can optionally define a configuration file which holds your settings.
Subcommands (scan, rebuild, dir2dat) have their own subsection. So for example
if you want to run dir2dat with the --input and --output options you can have
a file test.toml containing:

[dir2dat]\
input = "e:\\\\temp\\\\test"\
output = "e:\\\\temp\\\\test.xml"

and you can run clrmame.exe dir2dat --config test.toml

(note the \\\\ for path delimiters). Input and output refer to the --input and
--output options.
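
The doubled backslashes are needed because TOML treats the backslash as an escape character inside double-quoted strings. If you generate such a file from a script, a small Python sketch like the following can take care of the escaping. The helper names are made up for this example and only string options are handled.

```python
def toml_string(value):
    """Quote a string as a TOML basic string (backslash and quote escaped)."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_section(subcommand, options):
    """Render one [subcommand] table with string options, one per line."""
    lines = [f"[{subcommand}]"]
    lines += [f"{key} = {toml_string(val)}" for key, val in options.items()]
    return "\n".join(lines) + "\n"


# Prints the same content as the test.toml example above.
print(render_section("dir2dat", {"input": r"e:\temp\test", "output": r"e:\temp\test.xml"}))
```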


The following commandline options are used in the scanner and the rebuilder.

**-x, --xml**:
Here you specify the xml file which holds your database. This option is
mandatory. See below for the supported types of XML files.

**--bp, --backuppath**
A backup folder which is used when an output file already exists and
differs from the one which will replace it. The existing file will be put in
the backup folder. There it uses a similar path as in the output and it uses
the same storage method (compressed or plain file). For the scanner, unneeded
files which are not part of the loaded data will also be placed in the backup
folder. During the scan, the backup folder is also scanned (but not touched)
for possible fixing of missing files in the rompaths.

**-m, --mode**:
Your preferred output merge mode. See below what the four modes are. This is an
optional setting, default value is **split**. You can use **split**, **full**,
**standalone** or **nonmerged**.
               
**-p, --pattern**:
With this option you can specify an output pattern. Basically a path
information which is put in between the output folder and the machine name. See
below for examples.

**-f, --filter**:
Here you can specify a regular expression on the machine name. When you prefix
your filter with "xp:" you can also use an xpath expression to filter on
anything inside the xml. You only need to take care that your xpath selects a
machine node in the end. In general, only matching entries from the loaded xml
will then be taken into account during scan or rebuild. This is optional and no
filter is the default value. Use a prefix of "file:" and you can list a textfile
path which holds single entries (one per row) to limit your entries. The
"available:" prefix reads your rom and sample paths and uses the found machine
names as filter. You might notice a count difference in total vs filtered even
when you got all files. This is based on the fact that a) empty sets are
excluded and b) there are clones which are totally included in their parent, so
-in split mode- you don't have a standalone file/folder for it. So don't worry
about the count. The "available:" prefix is only available in the scanner, not
in the rebuilder.

**-l, --loglevel**:
Specify the detail level of the output. By default this is set to **info**. You
can use **err**, **warn**, **info** or **trace** where the latter one
additionally lists source file and rebuild information. **info** shows you a
little progress bar here and there and gives you some updates when reading
folders. If you redirect your output, progress bars and updating file counts are
not visible.

**--threads**:
The maximum number of threads used for scanning. Not specifying this option or
using 0 will automatically determine a good value. Avoid high values when you
scan lots of decompressed machines due to file seek overhead which slows down
the process.
If you use a high number, scanning and rebuilding work on various files in
parallel. This can immensely speed up the process. However, depending on your
hardware (e.g. HD versus SSD) and on the file sizes and how they are stored
on your device, parallel access can be a problem, especially when it comes to
lengthy operations like reading the files for calculating hash information.
The auto mode of this option uses a high value when using compressed files.
Usually only a small header of a compressed file is read where hash information
is stored. When working on uncompressed files, the auto mode will pick a
small value for the threads because otherwise your device will cause lots of
file seek operations. If you're not sure what to use, stay with the auto option.
If auto picks a small value, you can play around with it and test yourself if
increasing speeds up or slows down the process.

**--cl, --compressionlevel**:
Specifies the compression level which is used for 7z and zip compression. It
can be one of **best** (default), **normal**, **fast** or **store**. Keep in
mind, best maps to 7z ultra or a very high zstd compression level which can
be rather time-consuming but gives very good compression ratios.

**--cm, --compressionmethod**:
Specifies the compression method when compressing zip files. It can be
**deflate** (default) or **zstd**.


## The scanner specific commandline options

To use the scanner, you need to use the "scan" keyword.

The scanner allows the following additional commandline parameters:

**--rp, --rompath**
The path(s) where you store your roms and disks (chds). This is a required
option. Rompaths have to exist, can't be a subpath of rom/sample/backup or
addpaths.

**--sp, --samplepath**
The path(s) where you store your samples. If not specified, samples are not
scanned and not reported as missing. Multiple sample paths can be separated by
semicolon (;). Machines inside the sample path follow the same storing rules
(merge mode, placement) as roms in the rompaths. Samplepaths can't be a
subpath of rom/sample/backup or addpaths.

**--ap, --addpath**
An optional setting where you can specify one or more paths which are used to
find missing items from your rompaths. Addpaths are not touched, so a found
matching item is not removed but copied/added to your rompath. Addpaths can't
be a subpath of rom/sample/backup or addpaths.

**-s, --sha1**:
For the scanner this is only a flag and when set, files are not only matched by
crc32 and size but also by sha1. This usually requires a decompression of
archived files to memory to calculate the sha1 hash on the data which in the
end results in a slower scan operation.

**-n, --new**:
When setting this flag, the scanner will always do a new scan, i.e. it scans
all files in the paths and not only the ones which had an issue in a previous
scan. A new scan is also done if no previous data is available, even if this
flag is not set.

**--fm, --filtermode**:
Your preferred filtermode. It specifies how machines which do not match your
filter expression are handled. By default this value is set to **soft**. You
can use **soft** or **hard**. In **soft** mode, such unmatched machines are
simply ignored. In **hard** mode, they are marked as unneeded.
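
The difference between the two filter modes can be sketched in a few lines of Python. The function and the result structure are hypothetical and only illustrate the rule above.

```python
def apply_filter_mode(found_on_disk, matching, mode="soft"):
    """Split machines found on disk into scanned, ignored and unneeded ones.

    found_on_disk: machine names found in the rompaths.
    matching: machine names that match the filter expression.
    """
    if mode not in ("soft", "hard"):
        raise ValueError(f"unknown filter mode: {mode}")
    scanned = sorted(set(found_on_disk) & set(matching))
    rest = sorted(set(found_on_disk) - set(matching))
    if mode == "soft":
        # Unmatched machines are left alone.
        return {"scanned": scanned, "ignored": rest, "unneeded": []}
    # Unmatched machines are treated as unneeded.
    return {"scanned": scanned, "ignored": [], "unneeded": rest}


print(apply_filter_mode(["pacman", "outrun"], ["pacman"], "soft"))
print(apply_filter_mode(["pacman", "outrun"], ["pacman"], "hard"))
```

In combination with --fix, hard mode can therefore move unmatched machines to the backup folder, so double check the filter before using it.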

**--fix**:
Setting this flag will enable the actual fixing phase after the scan. If not
specified, no files will be altered, moved or even deleted.

**--um, --unneededmask**:
If you want to ignore specific files which are found as unneeded, you can use
a regular expression here to skip such files.

**--ba, --backupasaddpath**:
If enabled, the backup folder is also searched for missing entries (like
add paths).

**--ar, --addpathremove**:
If enabled, files which can be added to your rompath via addpaths (or backup,
see --ba) will be removed. This refers to all matching files. If there is a
matching file (no matter if it is used to fill in a missing file in the rompath
or it already exists there) in the add/backup paths, it will be removed from
its origin (add/backup path). Removal of files is only done when --fix is
enabled, too.

**--chdversion**:
Required chd version. Default is 5, -1 turns check off.

**--chdversionbaddumps**:
If enabled, chd version checks are also performed on bad dump chds.


## The rebuilder specific commandline options

To use the rebuilder, you need to use the "rebuild" keyword.

The rebuilder allows the following additional commandline parameters:

**-i, --input**:
The rebuilder input folder(s). This/These folder(s) are checked for matching
data. This option is mandatory and the folder(s) has/have to exist. Multiple
paths need to be separated by semicolon (;).

**-o, --output**:
The rebuilder output folder. In this folder, the matched data is copied/moved
to. This option is mandatory. If the folder does not exist, it will be created.
Note: When using -r, --recursive, the rebuilder output path can't be a
subfolder of any of the rebuilder input path(s).

**-c, --compress**:
This defines your preferred output compression method. This is an optional
setting. Default is **zip** which keeps your files in zip archives. You can use
**zip**, **rezip**, **7z** or **none**. The latter one would keep your machines
decompressed. **rezip** will always recompress the destination files and a
direct copy of archives is not performed.

**-d, --delete**:
With this option, rebuilt input files are removed. Be warned, they are gone! If
the last file from an input archive or folder is removed, this archive/folder
is also removed. Deletion is optional and disabled by default.
                
**-s, --sha1**:
This would turn on or off additional sha1 matching of input and/or output
files. Enabling it will be slower but more accurate. Default is **input**. You
can use **none** to turn on simple crc32/size checks, **input** to do sha1
checks on the input file only, **output** to do sha1 checks on a possibly
existing output file only and **both** which is identical to **input** and
**output**. **-s none** is the fastest mode but keep in mind, depending on the
files you're scanning, you may run into crc32 matches where in fact the sha1
would not match.

**-r, --recursive**:
Turn this on if you want to run through your input folder and all of its
subfolders.

**-u, --uselinks**
Default is **none**. **hard** or **sym** are possible other values.
Turn this on if you want to generate a filesystem hard or sym link instead of
doing a file copy operation. This takes place when copying archives 1:1,
copying chds or copying single unpacked files from a source to the target. Keep
in mind that there are limitations to the use of links in general and based on
its type. This includes volume restrictions and access rights.
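
What the three values mean on filesystem level can be shown with a short Python sketch. The function name is made up, and falling back to a plain copy when a link can't be created is a choice of this sketch, not documented behavior of the rebuilder.

```python
import os
import shutil


def place_file(src, dst, links="none"):
    """Create dst from src as a copy, a hard link or a symbolic link.

    Returns the method that was actually used.
    """
    if links not in ("none", "hard", "sym"):
        raise ValueError(f"unknown link mode: {links}")
    try:
        if links == "hard":
            os.link(src, dst)  # source and target must be on the same volume
            return "hard"
        if links == "sym":
            os.symlink(os.path.abspath(src), dst)  # may need extra rights on Windows
            return "sym"
    except OSError:
        pass  # link not possible here: fall back to a copy (sketch only)
    shutil.copy2(src, dst)
    return "copy"
```

A hard link shares the data with the source file, so no additional disk space is used, while a symbolic link only points to the source path and breaks if the source is moved or deleted.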

## Supported XML file types (-x, --xml)

Currently three types of xml files are supported:

- MAME -listxml XML output which holds machines (devices, bios) with roms,
disks and samples.

- MAME -listsoftware XML output which holds a software list collection, i.e.
several machines with roms and disks for several software lists (e.g. a2600
machines, c64 machines and so on)

- MAME software list XML. Such files can typically be found in MAME's hash
folder. They hold machines with roms and disks for one specific software list.
  
You can create an xml file by running MAME from the commandline interpreter and
redirect its output, e.g. mame.exe -listxml >266.xml.
The UI versions also support specifying a binary plus the export parameter
directly, so you can e.g. enter `c:\mame\mame.exe -listsoftware` in the
scanner or rebuilder xml input field.

When loading an input file you might see some warnings. For a standard MAME
-listxml you e.g. see sample specific warnings. It's mainly about sample
relationships from machines to a sample parent machine which is not available
in the XML. Such sample-only sets are generated automatically so that the
assignment is correct again. Similar warnings exist for the use of samples
which aren't available in the sample parent set. This is also fixed internally.


## XML, Input, Output, RomPaths

The four things you need to specify are

Scanner:
- the XML data basis (the 'datfile'), this defines all the machines
- your rompaths, one or more folders where your roms are stored.
- a backup path where replaced and removed files can be secured
- optionally: your sample paths

Rebuilder:
- the XML data basis (the 'datfile'), this defines all the machines
- your rebuilder source, a folder which is scanned for possible matches
- your rebuilder destination, a folder where found matches are being added to
- a backup path where replaced files can be secured


Examples:

Load a MAME -listxml xml and scan from f:\roms:
`clrmame.exe scan -x e:\MAME\244.xml --rp f:\roms --bp c:\backup`

Load a MAME -listxml xml and scan and fix all issues from f:\roms and f:\samples:
`clrmame.exe scan -x e:\MAME\244.xml --rp f:\roms --sp f:\samples --fix --bp c:\backup`

Load a MAME software list xml and rebuild from C:\Users\FooBar\Downloads to
f:\softwarelists\a2600_cass:
`clrmame.exe rebuild -x e:\MAME\hash\a2600_cass.xml -i C:\Users\FooBar\Downloads -o
f:\softwarelists\a2600_cass --bp c:\temp`

Load a MAME -listxml xml and rebuild from f:\roms and all its subfolders to
f:\mame\roms:
`clrmame.exe rebuild -x e:\MAME\244.xml -r -i f:\roms -o f:\mame\roms --bp c:\temp`

Load a MAME -listxml xml and rebuild from f:\roms and all its subfolders to
f:\mame\roms and remove all rebuilt files:
`clrmame.exe rebuild -x e:\MAME\244.xml -r -i f:\roms -o f:\mame\roms -d --bp c:\temp`


Rebuilding is more or less a copy operation of files. If we talk about CHDs,
we even talk about huge files. If your input and output folders are on the
same ssd/hd, you will create a lot of I/O traffic. Ever tried copying (not
moving) a multi GB file on one and the same hd? It usually crawls. So be aware
of this before you try to rebuild a complete MAME collection from one folder
to another.

Scanner speed can differ. It depends on how many files you have, which media
you use to store data and of course your memory/cpu power. Keep in mind
that e.g. for a current MAME the scanner has to check over roughly 40000 files.
A modern system can keep a full scan in diskcache so that a second run will
be way faster than an initial one.


## Modifying the output (-p, --pattern, -c, --compress)

For the rebuilder you can define how the output should be stored. Either
compressed (as zip or 7z) or decompressed as files and folders.

Create decompressed output: `-c none`

Create zip archives: `-c zip`

Create 7z archives:  `-c 7z`

Always recompress zip archives: `-c rezip`

The scanner determines the compression mode by looking at your existing
collection. It sees if you use decompressed, zip or 7z machines and uses this
as its default. If no automatic determination is possible, it uses zip.

There is another option to modify the output by defining some patterns which
can be used to add additional folders to the output: `-p sub1/sub2/sub3`

This will add three levels of subfolders to the given rebuilder output root or
scanner rom/sample paths. Assuming you specified e:/temp as output folder, your
machine sets will then be placed in e:/temp/sub1/sub2/sub3. While folder
separator characters (/ \) are allowed, . or .. can't be used
here.

Way more interesting are predefined patterns which can be used with the -p
command. You can use:

- **\#FIRSTCHAR\#** - uses the first character of the machine's name
- **\#YEAR\#** - uses the year information from the machine (e.g. 1997)
- **\#MANUFACTURER\#** - uses the manufacturer / publisher information from the
machine (e.g. Sega)
- **\#TYPE\#** - this uses 'device' for "isdevice=yes" machines,
'mechanical' for "ismechanical=yes" machines, 'bios' for "isbios=yes"
machines and 'default' for the rest. So you can split up your collection by
type. Bios has a higher priority than mechanical by the way.
- **\#BIOSSPLIT\#** - Similar to type but this additionally splits up machines by
bios ('bios_' + biosname). So for example the neogeo bios and all sets having a
dependency on the neogeo bios are put in 'bios_neogeo'.
- **\#SOFTLIST\#** - uses the software list name of the machine (e.g. a2600)
- **\#BIOS\#** - uses the bios name of the machine (e.g. neogeo)

Example:
You want to split up your collection by manufacturer and by year:
`-p #MANUFACTURER#/#YEAR#`.
You want to split up your collection by something which is nearly identical to
clrmamepro's system default paths: `-p #BIOSSPLIT#`
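
How such a pattern turns into a folder path can be sketched in Python. This is only an illustration under assumptions: the machine is a hypothetical dict, only three of the placeholders are covered, and the sketch does not deal with missing values or with characters which are not allowed in folder names.

```python
import re


def expand_pattern(pattern, machine):
    """Replace #NAME# placeholders in an output pattern for one machine.

    machine is a hypothetical dict with the keys name, year and manufacturer.
    """
    values = {
        "FIRSTCHAR": machine["name"][0],
        "YEAR": machine["year"],
        "MANUFACTURER": machine["manufacturer"],
    }

    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise ValueError(f"placeholder not covered by this sketch: #{key}#")
        return values[key]

    return re.sub(r"#([A-Z]+)#", substitute, pattern)


machine = {"name": "outrun", "year": "1986", "manufacturer": "Sega"}
print(expand_pattern("#MANUFACTURER#/#YEAR#", machine))  # Sega/1986
print(expand_pattern("#FIRSTCHAR#", machine))            # o
```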

If you're not happy with the prefix names (e.g. #device), you can alter them in
the TypeNames elements in settings.xml

If you're loading a software list collection datfile, you automatically have a
pattern of \#SOFTLIST\# active internally as top level. For scanning a software
list collection it would mean that you don't need to specify a pattern and you
only need to setup one rompath which points to the parent of the 3do_m2, 32x,
a2600, etc. folders.


## Merge Modes (-m, --mode)

A merge mode defines how your stored machines are bundled. Some machines
share a parent / clone relationship which is specified in the underlying
datfile. Depending on the chosen mode, such machines can be merged together.

We distinguish between:

- **split** (default) - Clone machines will only hold the files for the
specific clone, while the shared files stay in the folder/archive of the parent
machine. Parents or machines without any parent/clone relationship exist as
they are. Indicator for a machine rom/disk to be part of the parent set is the
merge attribute in the xml data. There can be clones which are 100% identical
to their parent machine. In such a case the clone does not exist, only the
parent does.

- **full** - Clone archives/folders don't exist (so you only have them for the
parent sets or sets which don't have a parent/clone relationship). The clones
are kept inside the parent archive/folder, too. They however use subfolders in
there, named after their machine name. So for example '10yard85' is a clone of
'10yard'. Only '10yard' exists and inside the archive or folder you have the
roms of the parent on top level and a subfolder named '10yard85' with the roms
of the clone. Subfolders only exist in compressed sets. MAME can't detect files
in subfolders of decompressed sets, so for decompressed modes (-c none), the folders are
flat, i.e. there are no clone subfolders. Be aware that in case of a hash
collision (same filename within a parent/clone relationship but different
hashes), you will only have one file. For CHDs which are always stored in a
decompressed way, the same applies.

- **standalone** - If you want to have one archive/folder which holds all
needed files for a machine, this is the mode you want. It will not only include
the parent files for a clone machine but also all other dependencies like bios
and devices. Each are kept in subfolders named after their machine
(device/bios) name (similar to the 10yard85 example above). If using
decompressed sets, they are stored in a flat folder, i.e. no subfolders exist.

- **nonmerged** - This mode has all roms of a given machine and ignores
parent/clone relationships, so a clone of some machine also includes all
listed parent files. It's a bit like standalone, but it does not
include dependencies like devices or bios files. No subfolders are used here.

If you load a datfile which doesn't have any parent/clone relationships you
can ignore the merge mode since it won't affect your output.
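
The placement rules of the split, full and nonmerged modes described above can
be sketched roughly like this for compressed sets. It is only an illustration
and not clrmame's actual code: the function `placement`, its parameters and the
rom names are made up for this example. Standalone mode is left out since it
additionally depends on bios and device data.

```python
def placement(mode, machine, parent, rom, merged):
    """Return (archive name, path inside the archive) for one rom.

    machine: name of the machine the rom is listed for
    parent:  name of its parent, or None if the machine is not a clone
    merged:  True if the rom has a merge attribute (shared with the parent)
    """
    if parent is None or mode == "nonmerged":
        # machines without a parent and nonmerged sets keep all their files
        return (machine, rom)
    if mode == "split":
        # shared files stay in the parent, clone specific ones in the clone
        return (parent, rom) if merged else (machine, rom)
    if mode == "full":
        # everything is in the parent, clone files go into a clone subfolder
        return (parent, rom) if merged else (parent, f"{machine}/{rom}")
    raise ValueError(f"unsupported mode: {mode}")


# hypothetical rom names for the 10yard / 10yard85 example
print(placement("split", "10yard85", "10yard", "clone.bin", False))
print(placement("full", "10yard85", "10yard", "clone.bin", False))
print(placement("nonmerged", "10yard85", "10yard", "shared.bin", True))
```

For decompressed sets the full mode result would be flat, i.e. without the
clone subfolder.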


## Limit your output (-f, --filter, -fm, --filtermode)

With the -f, --filter option you can filter the loaded XML to a subset of
machines. You define a regular expression which is matched against the
machine name. So for example if you only want to rebuild or scan "pacman", you
can simply add:
`-f pacman`

If you want to rebuild all machines which start with 'pac', you can write:
`-f pac.*`

For filtering only pacman and outrun, you'd write: `-f pacman|outrun`
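
If you want to try an expression before running a scan, a few lines of Python
are enough. This is only a sketch: it assumes that the whole machine name has
to match the expression (which the examples above suggest), the machine list is
made up, and Python's regex dialect may differ in details from the one used by
clrmame.

```python
import re

# hypothetical machine names, normally they come from the loaded XML
machines = ["pacman", "pacmanf", "pacplus", "outrun", "puckman"]


def apply_filter(pattern, names):
    # assumption: the complete name has to match, not just a part of it
    return [name for name in names if re.fullmatch(pattern, name)]


print(apply_filter("pacman", machines))
print(apply_filter("pac.*", machines))
print(apply_filter("pacman|outrun", machines))
```

Keep in mind that most shells treat `|` as a pipe, so an expression like
`pacman|outrun` usually needs to be put in quotes on the commandline.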

XPath filtering gives you more power. To use it, you need to prefix the
filter string with "xp:", otherwise the string is treated as a regular
expression. Your xpath expression needs to select software, machine or game
elements.

Filtering a -listsoftware output by only taking the Commodore software
lists into account:

`xp://softwarelist[contains(@description, 'Commodore')]/software`

You love Commodore and Atari, well, you can filter on them:
`xp://softwarelist[contains(@description, 'Atari') or contains(@description,
'Commodore')]/software`


Filtering a -listxml output by selecting machines which have a baddump disk:

`xp://machine[disk[@status='baddump']]`

Filtering a -listxml output by selecting machines which have preliminary
emulation status and Taito as manufacturer:

`xp://machine[driver[@emulation='preliminary'] and manufacturer='Taito']`
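
To see what the two -listxml expressions above select, here is a small sketch
on a made-up XML snippet. Python's built-in ElementTree only supports a subset
of XPath, so the selection is written out as plain Python conditions instead of
evaluating the expressions directly. The machine names and the sample data are
invented for this example.

```python
import xml.etree.ElementTree as ET

# tiny, made-up stand-in for a -listxml output
SAMPLE = """
<mame>
  <machine name="gooddisk">
    <manufacturer>Taito</manufacturer>
    <disk name="a" status="good"/>
    <driver emulation="good"/>
  </machine>
  <machine name="baddisk">
    <manufacturer>Taito</manufacturer>
    <disk name="b" status="baddump"/>
    <driver emulation="preliminary"/>
  </machine>
  <machine name="other">
    <manufacturer>Sega</manufacturer>
    <driver emulation="preliminary"/>
  </machine>
</mame>
"""

root = ET.fromstring(SAMPLE)

# same selection as xp://machine[disk[@status='baddump']]
baddump = [
    m.get("name")
    for m in root.iter("machine")
    if any(d.get("status") == "baddump" for d in m.findall("disk"))
]

# same selection as
# xp://machine[driver[@emulation='preliminary'] and manufacturer='Taito']
taito_preliminary = [
    m.get("name")
    for m in root.iter("machine")
    if m.find("driver[@emulation='preliminary']") is not None
    and m.findtext("manufacturer") == "Taito"
]

print(baddump)
print(taito_preliminary)
```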

You can also use a textfile which holds one plain machine name per row
to filter your xml. In this case you need to prefix the filter string with
"file:". In case you're scanning a software list collection file, you need
to prefix the single entries with the software list name, e.g.
a2600#defender and a5200#defender.

The "available:" filter (only available in the scanner) scans your rom and
sample paths for matching machine names and uses them as filter list.

With the filter mode, you define how unmatched machines are handled.
By default, unmatched machines are ignored. The scanner won't take them into
account when looking for fixable missing files either. The hard filter
mode is different. **Beware: When setting filter mode to hard, machines
which do not match your filter are handled as unneeded in the scanner!**

If you want to skip specific files/folders which are marked as unneeded you
can use unneeded masks. So if you e.g. want to ignore unneeded SL or CHD
folders in your rom and sample paths you can use
`^.*\\(CHD|SL)$`
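
A quick way to check what such a mask catches is to test it against a few
paths. This sketch assumes that the mask is applied to the full path using
backslashes as separators; the paths are made up.

```python
import re

# the unneeded mask from above: anything ending in \CHD or \SL
mask = re.compile(r"^.*\\(CHD|SL)$")

# hypothetical entries found in a rom path
paths = [
    r"c:\roms\CHD",
    r"c:\roms\SL",
    r"c:\roms\CHDs",
    r"c:\roms\pacman.zip",
]

ignored = [p for p in paths if mask.search(p)]
print(ignored)
```

Only the first two entries match, since the mask requires the path to end
right after CHD or SL.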


## Settings

settings.xml is created / loaded on startup and allows you to alter some
settings. The UI also stores all kinds of data here. You can change the
default pre-strings for the -p command #TYPE# and #BIOSSPLIT# values.
So e.g. you can change #default to 'StandardSets' or similar. Only valid
path characters are allowed. Illegal values won't be accepted and the
defaults are used again. You can also specify a temporary folder here which
is used for temporary decompression purposes (e.g. when an archive is in an
archive) or when data needs to get recompressed.


## UI Version - Scanner

The scanner UI is a pretty simple and hopefully easy to use user interface
which can be used if you don't want to spend time with the commandline version.
It is divided into 3 main parts. The left side holds the options, the right side
will be filled with the scan results in a tree like view. At the bottom
messages are logged. Scanner result and log can be resized either by resizing
the window or by dragging the dividers: vertically between log and tree
output, horizontally between options and tree output.

Options:
If you want to find out more about the meaning of the single options, check the
description of the corresponding commandline option. Typically you have combo
boxes which allow entering rom/etc path names. Browse buttons to the right
allow picking files and folders from a dialog. If the browse button contains
... and a + it supports picking multiple paths and it appends your selection to
the already existing one. A button labeled with an X clears the input. Holding
SHIFT while clicking removes all entries in the combo box. The combo boxes store
the last 5 previously selected unique inputs. Holding CTRL while clicking
removes the currently selected entry from the list and clears the current
input. The XML / Exe input is a bit enhanced over the pure commandline -x
option. It supports the import of XML data from a MAME binary. So you can
define e.g. `c:\mame\mame.exe -listxml` or `c:\mame\mame.exe -listsoftware`
and the scanner tries to export the data to its export folder and uses the
xml file from there. It names the file after the crc32 of the exe file plus
the used command. As soon as the binary changes, it will run another export,
otherwise it will load the already exported xml data. You can safely remove
files from the export folder from time to time. If you're on Linux/Wine, then
you can also use the direct import (via -listxml or -listsoftware) from a
Linux based MAME binary. There is a wrapper script (wrapper.sh) included which
is required to export the data. You need to do a "chmod +x ./wrapper.sh" on
it once.
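
The export caching described above can be pictured like this. It is a sketch
under assumptions, not the real implementation: the function names and the
exact file name layout (crc32 as 8 hex digits directly followed by the command)
are invented for illustration.

```python
import zlib
from pathlib import Path


def file_crc32(path, chunk_size=1 << 20):
    # read the binary in chunks so large executables don't need to fit in memory
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def export_file_name(exe_path, command):
    # hypothetical layout: <crc32 of the exe><command>.xml
    return f"{file_crc32(exe_path):08x}{command}.xml"


def cached_export(export_dir, exe_path, command):
    # returns the cached xml if the binary is unchanged, otherwise None
    candidate = Path(export_dir) / export_file_name(exe_path, command)
    return candidate if candidate.exists() else None
```

Since the crc32 of the binary is part of the name, a changed binary results in
a different file name and therefore in a new export, while old files simply
stay in the export folder until you remove them.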

Switching the XML/EXE combo box tries to restore the previously used option
settings for the now chosen xml/exe. So you can switch between some kind of
profiles for different underlying xml data.
There is a context menu option which allows quickly copying the current options
from the scanner to the rebuilder. Not all options are supported
but it's a quick way to take over options. The mapping is as follows:

| Scanner | -> | Rebuilder |
| --- | --- | --- |
| 1st Rom Path | -> | Output Path |
| Threads | -> | Threads |
| Log Level | -> | Log Level |
| Filter | -> | Filter |
| Pattern | -> | Pattern |
| Mode | -> | Mode |
| Backup Path | -> | Backup Path |
| XML/EXE | -> | XML/EXE |
| Add Paths | -> | Input Paths |
| Remove Matches | -> | Delete Rebuilt Files |
| Add Paths not empty | -> | Enable Recursive Input |
| Compression Level | -> | Compression Level |
| Compression Method | -> | Compression Method |

There is also a link option in the context menu which allows switching
the rebuilder XML/EXE combo box when the scanner's is switched. Identical
entries have to exist and you can't do it when an operation is running.

There is a small reset button at the top right of the XML/EXE combo section.
This removes all entries from the XML list which don't exist anymore.
Another reset button is at the top of the paths section. This clears all
entries in the combo boxes below.

Log:
The log list at the bottom lists all kinds of information during the scan.
Different colors are used to show the log levels. Errors (e.g. when a file
operation fails) are marked red and at the end of the scan you also get the
information if there were errors. You can alter the width of the columns. The
log window has a context menu which allows you to enable/disable autoscrolling
or copy the listed data to the clipboard.

Scan Result:
The tree like output will be filled at the end of a scan (unless it was
cancelled) and shows what issues were found during the scan. The output is
sorted by software list + machine description. A context menu is available
which gives you various options to show/hide information. You can decide what
you want to see/hide: completely missing machines (so you don't have a single
rom/disk/sample for it), machines with partial issues (e.g. some files are
missing or wrongly named, etc), complete machines (100%, all files are there)
and empty
machines (MAME has a lot of machines which are more for internal MAME use and
don't have any roms/disks/samples). You can also define what information is
printed out on rom/disk/machine level. So for example you can list entries with
the manufacturer or year, too. Or for roms you can also show size or sha1.
Copying data to the clipboard is also available.


## UI Version - Rebuilder

The rebuilder UI is pretty simple and can be helpful if you prefer clicking
instead of using a commandline. It is divided into 3 main parts. The left side
holds the options and at the bottom messages are logged. On the right side you
can see a tree like output which lists rebuilt files. The used icon reflects
the state of the rebuilt machine. It distinguishes between a green checked
circle (machine with all its roms and chds is in the output folder and at least
one file was newly rebuilt) and a blue checked circle (the machine is complete
but all files were skipped during the rebuild since they already exist). A half
filled circle corresponds to incomplete machines (green/blue for new files
added or already existing). You can influence the output with the options from
the context menu. The rebuilder window itself can be resized. There are dividers
between the options and the tree and between log and tree/options.

Options:
If you want to find out more about the meaning of the single options, check the
description of the corresponding commandline option. Typically you have combo
boxes which allow entering rom/etc path names. Browse buttons to the right
allow picking files and folders from a dialog. If the browse button contains
... and a + it supports picking multiple paths and it appends your selection to
the already existing one. A button labeled with an X clears the input. Holding
SHIFT while clicking removes all entries in the combo box. The combo boxes store
the last 5 previously selected unique inputs. Holding CTRL while clicking
removes the currently selected entry from the list and clears the current
input. The XML/EXE input is a bit enhanced over the pure commandline -x
option. It supports the import of XML data from a MAME binary. So you can
define e.g. `c:\mame\mame.exe -listxml` or `c:\mame\mame.exe -listsoftware`
and the rebuilder tries to export the data to its export folder and uses the
xml file from there. It names the file after the crc32 of the exe file plus
the used command. As soon as the binary changes, it will run another export,
otherwise it will load the already exported xml data. You can safely remove
files from the export folder from time to time. If you're on Linux/Wine, then
you can also use the direct import (via -listxml or -listsoftware) from a
Linux based MAME binary. There is a wrapper script (wrapper.sh) included which
is required to export the data. You need to do a "chmod +x ./wrapper.sh" on
it once.

Switching the XML/EXE combo box tries to restore the previously used option
settings for the now chosen xml/exe. So you can switch between some kind of
profiles for different underlying xml data.
There is a context menu option which allows quickly copying the current options
from the rebuilder to the scanner. Not all options are supported
but it's a quick way to take over options. The mapping is as follows:

| Scanner | <- | Rebuilder |
| --- | --- | --- |
| Rom Paths | <- | Output Path |
| Threads | <- | Threads |
| Log Level | <- | Log Level |
| Filter | <- | Filter |
| Pattern | <- | Pattern |
| Mode | <- | Mode |
| Backup Path | <- | Backup Path |
| XML/EXE | <- | XML/EXE |
| Add Paths | <- | Input Paths |
| Remove Matches | <- | Delete Rebuilt Files |
| Compression Level | <- | Compression Level |
| Compression Method | <- | Compression Method |

There is also a link option in the context menu which allows switching
the scanner XML/EXE combo box when the rebuilder's is switched. Identical
entries have to exist and you can't do it when an operation is running.

Log:
The log list at the bottom lists all kinds of information during the rebuild.
Different colors are used to show the log levels. Errors (e.g. when a file
operation fails) are marked red and at the end of the rebuild you also get the
information if there were errors. You can alter the width of the columns. The
log window has a context menu which allows you to enable/disable autoscrolling
or copy the listed data to the clipboard.


## Dir2Dat

Dir2Dat is a small datfile generator. It runs through a source folder
(optionally including sub folders), collects the file information (name,
hashes, size) and creates a datfile from this information at the end. There
are a couple of options which allow modifying the datfile format and
information output, like lowercase entries or subfolder mode. In default mode,
found subfolders or archives are kept as machines where the single files in
them represent roms (or disks).
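
The idea of the default mode can be sketched in a few lines. This is not the
dir2dat implementation: it only handles plain subfolders (no archives, no
recursion), the function name `dir_to_dat` is made up, and the element and
attribute names follow a common datfile layout which is assumed here and not
necessarily identical to what dir2dat writes.

```python
import zlib
import xml.etree.ElementTree as ET
from pathlib import Path


def dir_to_dat(source):
    # assumed layout: <datafile><machine name=".."><rom name size crc/></machine>
    root = ET.Element("datafile")
    folders = sorted(p for p in Path(source).iterdir() if p.is_dir())
    for folder in folders:
        # every subfolder becomes one machine
        machine = ET.SubElement(root, "machine", name=folder.name)
        ET.SubElement(machine, "description").text = folder.name
        files = sorted(p for p in folder.iterdir() if p.is_file())
        for file in files:
            # every file in the subfolder becomes one rom entry
            data = file.read_bytes()
            ET.SubElement(
                machine,
                "rom",
                name=file.name,
                size=str(len(data)),
                crc=f"{zlib.crc32(data):08x}",
            )
    return root
```

Calling `ET.tostring(dir_to_dat(some_folder), encoding="unicode")` then gives
you the datfile as text, where `some_folder` is whatever source folder you
want to inspect.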

## The dir2dat specific commandline options

To use dir2dat, you need to use the "dir2dat" keyword.

The following additional commandline parameters can be used:

**-i, --input**
The path which should be scanned. This option is required unless you use
a profile (-p).

**-o, --output**
The datfile to be generated. If a file of that name already exists, it will be
overwritten.

**-p, --profile**
In case you used the UI version of clrmame, you can reference a dir2dat
profile here. All settings are taken from that stored information. If you
additionally specify commandline options, they have a higher priority.

**-m, --match**
You can set an existing datfile here which is read during creation and is used
to match manufacturer, description and year information. You can also use the
same name as -o here. So you can reuse already existing meta information when
you regenerate a datfile.

**-l, --loglevel**
Specify the detail level of the output. By default this is set to **info**. You
can use **err**, **warn**, **info** or **trace** where the latter one
additionally lists source file and rebuild information. **info** shows you a
little progress bar here and there and gives you some updates when reading
folders. If you redirect your output, progress bars and updating file counts
are not visible.

**-r, --recursive**
Using this option will recursively run through the -i specified input paths.

**--lo, --lowercase**
All rom/disk/machine names and descriptions will be made lowercase (unless
they are matched via -m)

**--as, --addSha1**
This will calculate the sha1 hash value for each file and adds it to the dat.

**--ad, --addMd5**
This will calculate the md5 hash value for each file and adds it to the dat.

**--ay, --addYear**
Adds an empty year element to the dat (unless they are matched via -m)

**--am, --addManufacturer**
Adds an empty manufacturer element to the dat (unless they are matched via -m)

**--af, --addManufacturerFromFolder**
Use machine's folder name as manufacturer element (unless they are matched)

**--sb, --subfolderMode**
Subfolder in archives are reused in rom/disk elements. Subfolders below
rootlevel are kept as one machine element each.

**--rm, --removeExtension**
Removes file extensions from the machine name/description

**--a0, --add0ForEmpty**
Add a 0 byte rom element for empty folders. Otherwise empty folders don't
appear in the dat.

**--sf, --singeFileMode**
Put each found non-archive file in a new machine element.

**--ah, --addHeader**
Adds a datfile header. There are additional options to fill in the single
header elements.

**--ka, --keepArchivesAsFiles**
Keeps found archives as files, so you get name/size and hash information
on the full archive and not the files in the archive.

**--kc, --keepCHDsAsFiles**
Keeps found disks as files, so you get name/size and hash information
on the full disk and not the file stored in the chd.

**--sl, --softlistXML**
Generates a softlist datfile instead of a standard one.

**--hn, --hdrName**
**--hd, --hdrDescription**
**--hg, --hdrCategory**
**--hv, --hdrVersion**
**--ht, --hdrDate**
**--ha, --hdrAuthor**
**--he, --hdrEmail**
**--hh, --hdrHomepage**
**--hu, --hdrUrl**
**--hc, --hdrComment**
With these options you can set the corresponding entries in the datfile header.


## UI Version - Dir2Dat

The UI version of dir2dat needs one more required input. This is
a profile name, mainly a reference under which the current settings are
stored. This name can be reused in the commandline version via -p.
The dir2dat window looks similar to the rebuilder or scanner with its typical
options/tree/log output. The tree is used to visualize the final datfile.


## History

#### 2026-01-24 clrmame V0.6.2 released

Core:
- fixed: calculating hashes on 7z archives with more than one file cause crash


#### 2026-01-15 clrmame V0.6.1 released

Core:
- added: non-merge mode
- added: dir2dat, optionally add md5 value to roms
- fixed: rebuilder, very rare case where a fully copied archive can prevent
copying a single but shared rom from a different set
(kof2k4se/kf10thep *.m1.bin)
- fixed: scanner, a scan (not new scan) isn't able to fix missing files from a
different machine which shares the missing file and is fully ok (and so not
included in the scan)
- fixed: scanner, when name check finds a wrong named machine archive and
wrong named (chd) machine folder, it took one as unneeded instead of both as
wrong named
- misc:  reorganized hash getter to allow easier integration of any kind of
hash algorithm
- misc:  update spdlog to 1.17.0

UI:
- added: non-merge mode selection for rebuilder/scanner
- added: dir2dat option to optionally add md5 values
- fixed: redraw issue when grabbing, leaving the dialog and release
mousebutton


#### 2025-12-04 clrmame V0.6.0 released

Core:
- added: dir2dat module
- added: compression level option to set compression level for 7z/zip(zstd).
You can select "store", "fast", "normal", "best" (default)
- added: commandline version can use configurations from a toml or ini file
- misc: updated zipclass library with better compression level mapping for
zstd (zstd/best is pretty slow by the way)
- misc: updated CLI11 to 2.6.1
- fixed: don't overwrite settings.xml when using cmdline and ui version
- fixed: fixing a special fixable-missing + existing-but-wrongly-named
combination within a machine can lead to the same but inverted problem.
Unfortunately no message was shown so that the user thinks the files are ok
- misc: greatly improved sha1 prefetcher for solid 7z files
- misc: disabled multi folder selection under WINE since it's causing issues
there

UI:
- added: dir2dat module
- added: added compression level and compression method drop down in misc
tab
- misc: moved threads/loglevel to new misc tab
- misc: rebuilder: removed zstd / rezip zstd compression options, instead use
zip method drop down
- misc: scanner: removed auto detection of compression method, instead use zip
method drop down


#### 2025-10-16 clrmame V0.5.2 released

Core:
- fixed: temporary path getter can fail leading to unpacked files in rom,
sample or last used path
- misc:  updated spdlog to 1.16.0
- misc:  scanner, removed fileAccessible test for unneeded files in rompath,
it's pretty slow and normally all files within rompaths should be accessible
anyhow. If you now get warnings, please let me know

#### 2025-10-08 clrmame V0.5.1 released

UI:
- added: context menu options to take over compatible settings from current
Scanner to Rebuilder 'profile' and vice versa. Option is only available if no
scan (or rebuild) job is currently running. Not all options can be taken over,
mapping is described in the documentation
- added: context menu options to link Scanner and Rebuilder, so that if you
switch the XML/EXE in one, it will select the same in the other module (XML/EXE
needs to be setup in both first, e.g. with upper mentioned take over option).
This only works if the other module is currently not running.
- misc:  combobox, hold CTRL while clicking X additionally removes current entry
from list when clearing input, no more empty entries in the drop down list, use X
instead if you want to have an empty selection
- fixed: chd tab doesn't start with default settings (check V5 chds, no baddumps)

Core:
- fixed: 7z, addFiles(from, to) was limited to unique from files, i.e. multiple
additions from the same source (a->b, a->c) skipped creating files
- fixed: rebuilder, deletion of rebuilt unpacked files doesn't work when creating
archives
- fixed: rebuilder, creation of some destination files is skipped when a matched
file is used multiple times and is listed more than one time in the source machine
xml
- misc:  additionally give machine name when renaming roms/disks/samples due to
errors in dat


#### 2025-09-06 clrmame V0.5 released

UI:
- misc:  recoded resize/move/drag controls handling, allowing resizeable combo
boxes etc (resets your current positions once, though)
- misc:  moving options/modifiers in rebuilder/scanner to tabs, reducing used
vertical space
- misc:  aligned scanner/rebuilder look
- misc:  scanner, chd version check options are now visible in the chd tab
(before only in settings.xml)
- fixed: don't remember window placement on first close without changing tabs
- added: rebuilder, tree output for rebuilt files (context menu available, too)

Core:
- misc:  tweaked the determination of best fitting rompath a bit for users
which split chds/roms
- misc:  updated to spdlog 1.15.3
- fixed: scanner, updateMissingInformation phase isn't thread safe and could
lead to crash
- fixed: resolved potential reallocation/dangling pointer issues


#### 2025-04-20 clrmame V0.4 released

UI:
- scanner, fixed showing of empty and complete machines
- scanner, fixed showing of non rom/sample/disk related machine/file/folder
issues

Core:
- added: read/write support for zstandard compressed zip files
  - rebuilder, two zstd compression methods added, newly created/added files
  will use it (existing destination files which don't get replaced won't be
  updated). Source files which don't use the set compression method won't be
  taken into account when checking for a direct archive copy
  - scanner, automatically detects if you prefer zstd or deflate compressed
  zips and uses the preferred setting in the case that new files are added.
  - with both, rebuilder/scanner you can theoretically end up with zip files
  which have files in it with different compression methods. Rebuilder: if you
  rebuild to existing files, scanner, if you already have a mixture of
  archives using different compression methods.
- added: scanner, reporting 'wrong' chd version. Current expected version is
5, warning is not shown for baddumps. Version and baddump warning can be
altered in settings.xml
- misc:  changed the behaviour of devices which have romOf dependencies. When
fully merging such devices they are now handled like parent/clone machines. On
the one hand it makes sense since there are merge attributes indicating that
such sets belong together but there are also cases where they are distinct.
This is now aligned with clrmamepro, but still something which might need
further discussion
- misc:  updated cli11 to 2.5.0
- misc:  updated spdlog to 1.15.2
- misc:  updated bit7z to 4.0.10
- misc:  scanner, using absolute pathnames for error reporting in path scan
- misc:  scanner, only trace-log machines with issues in fixing phase
- fixed: scanner, typo "uneeded"
- fixed: scanner: detection of empty but unneeded folders in archives
- fixed: scanner, backup of chds in software list collections can fail
- fixed: scanner, archives which match a valid rompath subfolder name
(e.g. softwarelist name via pattern or automatically) are falsely iterated
like a folder instead of listing them as unneeded (wrong placed)
- fixed: crash on rom definitions without a crc
- fixed: filter enrichment might miss sampleof dependencies
- fixed: doing archive backups from a folder and backup archive already
exists creates a new archive instead of merging files in


#### 2025-02-26 clrmame V0.3 released

UI:
- added: optionally hide/show roms/disks/samples output via context menu
- fixed: remove datasource XML file when hash folder changed (should only happen
  when XML file was internally created from an exe export)

Core:
- added: "available:" filter which limits the machine selection to files
  you have. You might notice a count difference in total vs filtered even when
  you got all files. This is based on the fact that a) empty sets are excluded
  and b) there are clones which are totally included in their parent, so -in
  split mode- you don't have a standalone file/folder for it. So don't worry
  about the count.
- misc:  "file:" filter, in case you work with softwarelist collections,
  you'd need to prefix single entries with sl-name#pacman (e.g. a2600#pacman)
  to specify which set you're referring to. Not needed for single sl files or
  standard dats though.
- misc:  some earlier cancel returns in fix wrong named disk/sample/rom/machine
- misc:  dupe output now shows all found paths belonging to a machine
- misc:  updated spdlog to 1.15.1
- misc:  limited AUTO thread switch to 25 threads as max (or less depending
  on your hardware). You can still manually select more or less if you like
- fixed: dupes can be listed multiple times
- fixed: backups for unneeded folders can create wrong and very long folders
  due to wrong encapsuling
- fixed: backup can miss empty unneeded folders at first scan
- fixed: fill-in file from addpath/backup path was removed (when requested)
  even when fill-in copy failed and you might have had a 0 byte new archive
- fixed: log pattern in commandline mode


#### 2025-01-20 clrmame V0.2 released

UI:
- added: #thread selector
- added: scanner, Clipboard->Copy Machine Names option
- fixed: rebuilder log Copy To Clipboard misses linefeed

Core:
- added: you can define a thread pool size which is used for parallel scanning
  rebuilding/etc. Either you keep auto (default value) or pick the number of
  threads used for the pool yourself. "auto" picks a high value when you work
  with compressed files and a low value when using decompressed ones. A low
  value is better when you run into file seek overhead issues. If you're not
  happy with auto, you can try to tweak it yourself.
- fixed: scanner and rebuilder ignore output path, mergemode, pattern changes
  when xml data hasn't changed
- misc: more effective reimplementation of thread pools and thread queueing
- misc: prefetching sha1s on an archive if needed (speed increase, especially
  for archives with lots of files)
- misc: rebuilder: major update / reworked internal logic, less complex, way
  smaller memory footprint especially for dats out of hell (dats with
  thousands of dupe files), faster.
- misc: scanner, smaller memory footprint and speed update for "dats out of
  hell"
- fixed: scanner, accidentally getting a crc32 on sub folders showing an error
- fixed: scanner, detection of unneeded empty sub folders failed
- fixed: scanner, using old archive name when trying to remove some unneeded
  files causing can't remove files/can't access messages
- misc:  updated bit7z to 4.0.9
- misc:  updated pugixml to 1.15
- misc:  use "-" instead of "_" for replacing illegal file chars (cmpro align) 
- misc:  monitoring hash folder for -listsoftware exports. In case of a
  change, cached data is ignored and a new export is generated. This is more
  for users which update their hash folders during the development phase of
  MAME on a regular basis and don't want to recompile the binary.
- misc: use one version number, fixed a tooltip typo ;-)


#### 2024-12-03 clrmame / Scanner V0.08 / Rebuilder V0.15 released

UI:
- misc:  restore default window positions when positions get out of range
- misc:  increase display time of tooltips
- misc:  slightly regrouped ui elements
- added: scanner, context menu option to restore old scan results on startup
and selection

Core:
- misc:  updated spdlog to 1.15.0
- misc:  updated to 7zip 24.09
- misc:  don't reload xml and build up structures when they are already in
memory and need no refresh
- misc:  align and share rebuilder/scanner common routines
- added: scanner, samples, supporting flac
- fixed: scanner, typo in wine script
- fixed: scanner, internal sha1 check for files with identical crc32 but
different sha1 values isn't run
- fixed: scanner, samples, wrong named files (case check only) aren't fixed
- fixed: scanner, samples, wrong named files (case check only) also appear
as unneeded
- fixed: scanner, samples, decompressed files are marked as unneeded
- fixed: scanner, samples, machines which reference themselves via sampleOf
are marked as unneeded in full mode
- fixed: scanner, confusing 'can't remove/backup' messages in case of circular
renames between different sets
- fixed: scanner, confusing 'can't remove/backup' messages in case a wrong
file with a right name is replaced with a fill in


#### 2024-11-01 clrmame / Scanner V0.07 / Rebuilder V0.14 released

UI:
- misc: scanner, contextmenu option to sort by name instead of description
- misc: Linux/Wine, users can get data from a (linux) mame binary. A wrapper
script is included (requires a chmod +x ./wrapper.sh once though)
- misc: Linux/Wine, minor changes regarding line breaks or layout

Core:
- misc:  in case of a -listsoftware data basis (either exe export or dat),
additional sl hashes are added from either the used exe's hash folder (prio 1)
or HashFolder specified in the settings xml file (prio 2).
- fixed: some absolute paths aren't made 32k path length aware, this can lead
to sideeffects, e.g. when doing backups (esp. in Linux/Wine)
- fixed: scanner, removal of matched files from addPath/backup only works when
at least one file was missing
- fixed: scanner, removal of unneeded files stopped after first error during a
delete process so that other files in the queue were skipped
- fixed: scanner, fixing wrong named files/folders which only differ by a
character case change fails
- fixed: scanner, unneeded sample files are detected but not reported or fixed
- fixed: scanner, (rare) "can't read" error message when accidentally testing a
folder for being a chd
- fixed: scanner, (rare) fixing wrong named files/folders which only differ by
a character case but map to multiple machines keeps only 1 instance (e.g.
3DO->3do & 3dobios)
- fixed: scanner, circular renames inside a machine throw rename errors and
only get resolved in a second run


#### 2024-09-27 clrmame / Scanner V0.06 / Rebuilder V0.13 released

Core:
- misc:  set current folder change back to application folder after
scan/rebuild to avoid folder locking
- fixed: scanner, already moved "missing but fixable" unpacked files removes
the moved file afterwards when they were also marked as unneeded in their
source position
- fixed: scanner, a failed file operation on "missing but fixable" removes the
file when they were also marked as unneeded in their source position
- fixed: obsolete error message when using file filters
- fixed: avoid 'can't read' ios_base::failbit set errors on files where the
file length is smaller than the minimal buffer for the operation (e.g. id or
header read)
- fixed: scanner, obsolete "can't access" messages during fix unneeded due to
not correctly filtering already removed archives


UI:
- misc:  SHIFT clicking the clear buttons clear the full combo box
- misc:  when context menu show info->size is enabled, also show the sum on
software list level
- misc:  remember last used tab on close/start
- fixed: restore window positions incl. vertical break isn't always working


#### 2024-09-19 clrmame / Scanner V0.05 / Rebuilder V0.12 released

Combined rebuilder/scanner in one commandline and one UI version, let's name
it clrmame for now

Core:

- misc:  scanner, remove matches from AddPath option now removes ALL matches,
no matter if they were just real fill-ins for a missing file or already
existing in a rompath
- misc:  hiding sample specific problems while loading datfile in rebuilder
and in scanner when no sample paths are set
- misc:  scanner, skip not accessible unneeded files (loglevel trace can show
information about them)
- fixed: using relative paths fails after first scan due to wrong current
folder
- fixed: when an elapsed time takes less than 0 seconds, no value as elapsed
time is shown ;-)
- fixed: hash calculation of empty files
- fixed: scanner, some unneeded files won't be removed but aren't reported
either when fix is enabled

UI:

- added: scanner, context menu option to show the names of clones with issues
(in full merge mode only) right beside the set name, up to 5; if there are
more, the tooltip shows all of them
- misc:  merged rebuilder and scanner, switchable via tabs, can run in
parallel (unless path sharing issues are detected)


#### 2024-08-30 Scanner V0.04 / Rebuilder V0.11 released

Core:

- added: scanner, option to remove fill-in files from addpaths (this can
include backup when 'act as' option is on). Only files which were actually
needed in the rompaths are removed
- misc:  changed elapsed time format and show an overall duration
- fixed: handling of loadflag="continue" fails in software list based xmls
- fixed: scanner, fixing missing roms/disks can accidentally move the rom/disk
when it shouldn't
- fixed: scanner, parent folders of fill-in files can be touched (timestamp
update) when they shouldn't
- fixed: scanner, fix missing roms/disks can accidentally look in a folder where
the match is not present leading to can't access/can't backup messages

UI:

- added: checkbox for the remove fill-in files option mentioned above
- misc: updated some flyovers



#### 2024-08-22 Scanner V0.03 / Rebuilder V0.10 released

UI:

- added: scanner, always have software list collection sublevels (before a
complete sl list only showed the complete top level)
- added: scanner, context menu option to auto expand scan results tree
- added: scanner, context menu list options are now also available on software
list level
- fixed: active paths are not always remembered correctly when switching xmls
or on initial load
- fixed: scanner, empty software lists show complete instead of empty icon
- fixed: scanner, scan results tree/log split isn't refreshed after
minimizing/maximizing window

Core:

- added: long path/filename support (32k)
- misc:  update to bit7z 4.0.8
- misc:  scanner: propagate changed file timestamp up to rom/samplepath level
- misc:  scanner: improved preferredPackMethod a bit
- misc:  scanner: whenever possible, move (instead of copy/delete) missing
but fixable chds or decompressed roms
- fixed: erroneously allow \ as a machine name character
- fixed: scanner, stopping the scan during fix operations doesn't have an
effect
- fixed: scanner, freeze when fixing unneeded decompressed files (or chds)
(0.02.1)
- fixed: scanner, wrong named files which are also fill-ins for missing files
can trigger an obsolete file removal operation (sdiamond MAME.268)
- fixed: scanner, unneeded files/folders contain multiple identical entries
(aa3020 MAME.268)
- fixed: scanner, fixing a wrong named folder failed when additional pattern
folders were involved (copy error)
- fixed: scanner, fixing a wrong named folder failed when folder can have
multiple new names (aa3020/a3010/aa5000 MAME.268)  (move error)
- fixed: scanner, preferred path lookup for software list collections never
returned additional pattern folders (copy error)
- fixed: scanner, preferred path lookup for wrong named machines which only
consist of chds failed (freeze during backup)
- fixed: scanner, removal of unneeded files can result in an empty folder
which isn't removed



#### 2024-07-15 Scanner V0.02 / Rebuilder V0.09 released

UI:

- added: scanner log *and* tree can be X/Y resized now (mouse click+move
between the 2 controls)
- added: clear button to clear all combo boxes for current 'profile'
(rebuilder and scanner)
- added: clear button to remove non-existent XML/EXE entries
- misc: increased the number of remembered xml entries to 10 (rebuilder
and scanner)
- added: controls for unneeded masks, include backup as addpath
- misc: improved storing of per-xml data
- fixed: filtermode setting isn't restored


Core:

- added: support more XML based datfiles
- added: unneeded masks (regular expression to filter out specific
unneeded files/folders) (e.g. ^.*\\(CHD|SL)$)
- added: handle backup as addpath option
- added: filter option supports prefix 'file:' to specify a file holding
setnames used for filtering
- fixed: several obsolete "can't remove" errors based on not taking an in
between rename into account
- fixed: machine/rom/disk/sample names may contain illegal path
characters (incl. ending with .). Remove characters and warn about it
- fixed: prevent rom/disk with identical names but different hashes to get
overwritten in decompressed standalone or full merged mode (e.g. sgi_mips
chds). Filename gets prefixed with hash. MAME might fail loading it but at
least the file is not lost
- fixed: fix-missing can skip one file from a batch and doesn't fix or
report it
- fixed: application quits when directory scans cause an exception (e.g.
access right denied)
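
To make the unneeded mask example from the list above more concrete, here is a small sketch with `std::regex`. It assumes ECMAScript syntax and a match against the full path; both are assumptions for illustration, and `filterUnneeded` is a hypothetical helper, not clrmame's API. Read literally, the mask `^.*\\(CHD|SL)$` matches every path whose last component is `CHD` or `SL` after a backslash:

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: drop every unneeded entry whose full path matches the
// mask, so that it is neither reported nor removed.
std::vector<std::string> filterUnneeded(const std::vector<std::string>& unneeded,
                                        const std::string& mask)
{
    const std::regex expression(mask);  // ECMAScript syntax assumed
    std::vector<std::string> kept;
    for (const std::string& path : unneeded)
    {
        if (!std::regex_match(path, expression))
            kept.push_back(path);
    }
    assert(kept.size() <= unneeded.size());  // a mask can only remove entries
    return kept;
}
```

In C++ source the mask is easiest to write as a raw string, e.g. `filterUnneeded(paths, R"(^.*\\(CHD|SL)$)")`, which avoids doubling the backslashes once more.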


#### 2024-06-20 Scanner V0.01 released
- initial release

#### 2024-06-20 Rebuilder V0.08 released
- only settings.xml is used now
- added required backup path option
- added xpath filtering
- added multiple input paths support
- using combined scanner/rebuilder core now
- updated 3rd party libs

#### 2023-11-28 Rebuilder V0.07 released
- fix software list rom sizes determination (wasn't limited to loadflag value)
- fix software list merging (SL/SL collections don't use merge attributes, so
lookup by hash in a parent/clone relationship)
- don't use # in default pattern (rompath) names since such names would be cut
off when used in mame.ini due to comment handling
- pattern names can't end with '.' (Windows doesn't like this), replaced cases
with "_"
- minor changes to the stats count output
- updated 3rd party libs (spdlog, bit7z, pugixml)

#### 2023-05-04 Rebuilder V0.06 released
- run source and destination file matching in multiple threads (speed up)

#### 2023-04-14 Rebuilder V0.05 released
- general unicode handling overhaul, utf8 chars in pathnames, patterns, xml,
files, folders, archives, console output should be fine now

#### 2023-03-12 Rebuilder V0.04 released
- support reading of (split)rar/(not split)7z and writing of 7z files
- detection of zip, 7z, rar, chd files by byte signature (instead of
extensions, but not within archives)
- selectable tempfolder in settings.xml
- minor speed up due to upfront matching size check
- updated various 3rd party libs, added 7z.dll
- ctrl-c stops the rebuilding and cleans up temporary files/folders
- various internal cleanup
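
Detection by byte signature, as mentioned in the list above, can be sketched as follows. The magic bytes are the publicly documented ones for each format; the names `FileType` and `detectType` are made up for this example and say nothing about how clrmame implements it:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class FileType { Unknown, Zip, SevenZip, Rar, Chd };

// True when `data` starts with the raw bytes of `signature`.
static bool startsWith(const std::vector<unsigned char>& data, const std::string& signature)
{
    assert(!signature.empty());
    return data.size() >= signature.size() &&
           std::equal(signature.begin(), signature.end(), data.begin(),
                      [](char expected, unsigned char actual)
                      { return static_cast<unsigned char>(expected) == actual; });
}

// Classify a file by its first bytes; the extension is never consulted.
FileType detectType(const std::vector<unsigned char>& header)
{
    if (startsWith(header, "PK\x03\x04"))         return FileType::Zip;      // zip local file header
    if (startsWith(header, "7z\xBC\xAF\x27\x1C")) return FileType::SevenZip;
    if (startsWith(header, "Rar!\x1A\x07"))       return FileType::Rar;      // shared by RAR4 and RAR5
    if (startsWith(header, "MComprHD"))           return FileType::Chd;
    return FileType::Unknown;
}
```

Note that an empty zip archive starts with an end-of-central-directory record instead of a local file header, so a complete detector would need at least one more signature.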

#### 2022-10-05 Rebuilder V0.03 released
- use a real move operation in case of copy/deleting single files (incl. chds)
- add option -u, --uselinks to generate filesystem hard or sym links instead
of doing a file copy or move operation
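
What linking instead of copying means at the filesystem level can be shown with `std::filesystem`. This is only a sketch: the changelog says that -u creates hard or sym links, while the fallback order used here (hard link, then symbolic link, then copy) and the names `placeFile` and `PlaceResult` are assumptions made for the example:

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

enum class PlaceResult { HardLink, SymLink, Copied, Failed };

// Hypothetical helper: make `source` available at `destination`, preferring
// links over a real copy.
PlaceResult placeFile(const fs::path& source, const fs::path& destination)
{
    assert(source != destination);  // linking a file onto itself is a caller error

    std::error_code ec;
    if (!fs::is_regular_file(source, ec))
        return PlaceResult::Failed;

    // 1. Hard link: no extra disk space, but both paths must be on one volume.
    fs::create_hard_link(source, destination, ec);
    if (!ec)
        return PlaceResult::HardLink;

    // 2. Symbolic link: works across volumes, may need privileges on Windows.
    const fs::path absoluteSource = fs::absolute(source, ec);
    if (!ec)
    {
        fs::create_symlink(absoluteSource, destination, ec);
        if (!ec)
            return PlaceResult::SymLink;
    }

    // 3. Plain copy as the last resort.
    const bool copied = fs::copy_file(source, destination, ec);
    return (copied && !ec) ? PlaceResult::Copied : PlaceResult::Failed;
}
```

Because a hard link is just a second directory entry for the same data, the same rom can appear in several sets at almost no cost in disk space.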

#### 2022-08-16 Rebuilder V0.02 released
- since MAME can't handle subfolders in decompressed sets, decompressed sets
and chds are always stored flat in folders (no clone/dependency subfolders in
full or standalone mode). When kept compressed, the archives will hold
subfolders
- a non-existent romOf reference leads to removed merge information for the
machine

#### 2022-07-13 Rebuilder V0.01 released
- initial release

## What are the benefits over clrmamepro

Besides the (in my opinion) way better code, these points are worth mentioning:
- loading in a datfile is extremely fast compared to clrmamepro so that
actually nothing has to be cached
- rebuilder supports chd rebuilding which was requested so many times
- rebuilder supports hard/symlinks which speeds up processing and saves disk
space and offers you a way to keep machine sets in different merge modes
without spending too much extra disk space
- it supports the standalone mode which makes so much more sense than the old
unmerged one
- rebuilder is able to detect sets (no matter how they are named) which can be
directly copied
- you can safely load a -listsoftware MAME xml output and can rebuild all
software lists in one go and of course you can scan them easily without the
need of a complex path setup
- codewise, there is no MS MFC use, just plain STL and 3rd party libs inside
the core. Core and UI are strictly independent.
- different merge modes are now handled completely differently than in
clrmamepro, where you usually have special rules for "if mergemode is full and
machine is a clone then do this...else this". The new rebuilder and scanner work
on a view on the xml data, i.e. in terms of rebuilding data there is no
difference at all between the modes. The view either corresponds to full, split
or standalone data, which makes the actual action totally independent of the
chosen mode. Due to the file/folder structure in full and standalone modes you
get rid of possible problems when roms have the same name but different hashes
within a parent/clone relationship.
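
The view idea from the last point can be illustrated with a heavily simplified sketch. Everything in it is hypothetical (the `Machine` record, `buildView`, matching by rom name instead of hash, no BIOS or device dependencies) and it is not the actual clrmame core. It only shows how the merge mode can be confined to one function, while the action that consumes the view stays identical for all modes:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class MergeMode { Split, Full, Standalone };

// Simplified, hypothetical machine record; matching is by rom name only.
struct Machine
{
    std::string name;
    std::string cloneOf;            // empty for parents
    std::vector<std::string> roms;  // every rom the machine needs
};

// The view: set name -> entries stored in that set.
using View = std::map<std::string, std::vector<std::string>>;

View buildView(const std::vector<Machine>& machines, MergeMode mode)
{
    std::map<std::string, const Machine*> byName;
    for (const Machine& machine : machines)
    {
        assert(!machine.name.empty());
        byName[machine.name] = &machine;
    }

    View view;
    for (const Machine& machine : machines)
    {
        // A clone whose parent is unknown is treated like a parent.
        const auto found = byName.find(machine.cloneOf);
        const Machine* parent = (found != byName.end()) ? found->second : nullptr;

        for (const std::string& rom : machine.roms)
        {
            const bool inParent = parent != nullptr &&
                std::find(parent->roms.begin(), parent->roms.end(), rom) != parent->roms.end();

            if (mode == MergeMode::Standalone || parent == nullptr)
                view[machine.name].push_back(rom);   // self-contained set or parent set
            else if (inParent)
                continue;                            // already stored with the parent
            else if (mode == MergeMode::Split)
                view[machine.name].push_back(rom);   // clone keeps only its own roms
            else
                view[parent->name].push_back(machine.name + "/" + rom);  // full: subfolder in parent set
        }
    }
    return view;
}

// The action only sees the view and never asks which mode produced it.
std::size_t countEntries(const View& view)
{
    std::size_t total = 0;
    for (const auto& set : view)
        total += set.second.size();
    return total;
}
```

In full mode the clone entries carry the clone name as a subfolder, which is why two roms with the same name but different hashes no longer collide.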

On the other hand, I of course understand if users start to moan "but it does
not have feature x and y", "it does not support datfile type z" and so on. Yes,
this is the case, but currently I don't want to implement requests which might
be used by 1% of the users. Time will tell what comes next.


## Future Plans / Source Availability

In this state of the project it is closed source, mainly due to the use of the
full version licence of ZipArchive.

There are definitely plans to go open source in the future. Currently there
are discussions about some licences (e.g. the free version of ZipArchive is
currently GPL).

Things I'm interested in:
- a profiler
- your feedback, bug reports, ideas and suggestions

Things I'm not interested in:
- Supporting anything old, adding exceptional handling for weird case x and y,
anything non MAME related


## Bug Reporting / Donation

If you found something spooky or have problems, feel free to use the clrmamepro
forum: https://www.emulab.it/forum/index.php?board=6.0

If you're totally happy with it, feel free to donate ;-)
https://mamedev.emulab.it/clrmamepro/#donate


## Third party licence information


### Zip Handling: [ZipArchive](http://www.artpol-software.com/ZipArchive/)

ZipArchive Library 4.6.9 Copyright (c) Tadeusz Dracz

Currently using the 'full version' licence, which makes the product closed
source for now.


### 7z/Rar Handling: [Bit7z](https://github.com/rikyoz/Bit7z)

Bit7z v4.0.10 Copyright (c) 2014-2025 Riccardo Ostani 

https://github.com/rikyoz/bit7z/blob/master/LICENSE

MPLv2 License

You can obtain a copy of the MPLv2 License here https://mozilla.org/MPL/2.0/

7z.dll is used which is part of the 7-Zip program. 7-Zip is licensed under the
GNU LGPL license. You can find 7-zip including source code at
https://www.7-zip.org


### CLI Parser: [CLI11](https://github.com/CLIUtils/CLI11)

CLI11 2.6.1 Copyright (c) 2017-2025 University of Cincinnati, developed by Henry
Schreiner under NSF AWARD 1414736. All rights reserved.

https://github.com/CLIUtils/CLI11/blob/main/LICENSE

Redistribution and use in source and binary forms of CLI11, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
   may be used to endorse or promote products derived from this software without
   specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


### XML Parser: [PugiXml](https://github.com/zeux/pugixml)

pugixml 1.15 Copyright (c) 2006-2025 Arseny Kapoulkine

https://github.com/zeux/pugixml/blob/master/LICENSE.md

MIT License (MIT) (see below)


### Logging: [SpdLog](https://github.com/gabime/spdlog)

SpdLog 1.17.0 Copyright (c) 2016-2026 Gabi Melman. 

https://github.com/gabime/spdlog/blob/v1.x/LICENSE

MIT License (MIT)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.


### SHA1 calculation: [CSHA1](https://www.dominik-reichl.de/)

CSHA1 2.1

100% free public domain implementation of the SHA-1 algorithm
by Dominik Reichl <dominik.reichl@t-online.de>


### MD5 calculation: [MD5](https://github.com/yaoyao-cn/md5)

MD5
Converted to C++ class by Frank Thilo (thilo@unix-ag.org)
for bzflag (http://www.bzflag.org)
 
based on:

md5.h and md5.c
reference implementation of RFC 1321
 
Copyright (C) 1991-2, RSA Data Security, Inc. Created 1991. All
rights reserved.
 
License to copy and use this software is granted provided that it
is identified as the "RSA Data Security, Inc. MD5 Message-Digest
Algorithm" in all material mentioning or referencing this software
or this function.
 
License is also granted to make and use derivative works provided
that such works are identified as "derived from the RSA Data
Security, Inc. MD5 Message-Digest Algorithm" in all material
mentioning or referencing the derived work.
 
RSA Data Security, Inc. makes no representations concerning either
the merchantability of this software or the suitability of this
software for any particular purpose. It is provided "as is"
without express or implied warranty of any kind.
 
These notices must be retained in any copies of any part of this
documentation and/or software.