Overview
The two DAOS control files are daoscat.nsf and daos.cfg. In general, it is a very bad idea
to delete them. Doing so can lead to widespread attachment
unavailability, unnecessary duplication of data, unnecessary traffic to
daoscat.nsf, and difficulties in restoring .NLO files.
Background
A quick tour of some aspects of DAOS:
The daos.cfg file contains a list of the directory paths under
the DAOS repository root, and a count of the number of .NLO files stored
in each one. The count information is used to ensure that a limited
number of .NLO files are stored in each directory for filesystem
performance reasons.
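As a rough illustration of how such per-directory counts could be used, the sketch below models the list in memory and picks a directory that still has room. The class name, the function name, the cap of 3 files, and the four-digit subdirectory names are all assumptions made for the example; this note does not describe the actual daos.cfg layout or the real limit.

```python
from dataclasses import dataclass

# Hypothetical per-directory cap, kept tiny so the example is easy to follow.
MAX_NLO_PER_DIR = 3

@dataclass
class DirEntry:
    path: str       # directory path under the DAOS repository root
    nlo_count: int  # approximate number of .NLO files stored there

def pick_directory(entries):
    """Return the index of the first directory with room, adding one if all are full."""
    for index, entry in enumerate(entries):
        if entry.nlo_count < MAX_NLO_PER_DIR:
            return index
    # Every directory is at the cap: start a new one (naming scheme is assumed).
    entries.append(DirEntry(path=f"{len(entries) + 1:04d}", nlo_count=0))
    return len(entries) - 1

entries = [DirEntry("0001", 3), DirEntry("0002", 1)]
chosen = pick_directory(entries)   # "0001" is full, so index 1 is chosen
entries[chosen].nlo_count += 1     # the count only needs to be approximately right
```

Because the counts exist only to keep directories from growing too large, a count that is off by a few files changes nothing important in a model like this.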
The daoscat.nsf file contains three major items:
1) The DAOS Object Index (DOI), which is a list of the key (.NLO
filename), reference count, and location of every .NLO file in the
repository.
2) The DAOS ID Table (DIT), which is a list of all .NSF files
participating in DAOS, with the total number of .NLO references for each
.NSF file.
3) The deletion list, which is a collection of the keys that have 0
references, with the datestamp of when the last reference went away.
When an attachment is stored in DAOS, a 'ticket' is written to the NSF
as a placeholder. This ticket contains the key for the attachment, as
well as a location hint. If the hint is incorrect, the DAOS catalog is
consulted to look up the key to find the current location for the .NLO
file.
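The hint-then-catalog lookup described above can be sketched as follows. This is a simplified model, not the DAOS implementation: the Ticket fields, the locate function, and the use of a set and a dict to stand in for the repository and the DOI are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    key: str   # key of the attachment (the .NLO filename)
    hint: int  # location hint recorded when the ticket was written

def locate(ticket, stored, doi):
    """Return (location, catalog_lookups) for the .NLO file behind a ticket.

    stored: set of (location, key) pairs standing in for files on disk
    doi:    dict mapping key -> current location, standing in for the catalog
    """
    # Fast path: the hint is still correct, so the catalog is never touched.
    if (ticket.hint, ticket.key) in stored:
        return ticket.hint, 0
    # Slow path: the hint is wrong, so look the key up in the catalog.
    location = doi.get(ticket.key)
    if location is not None and (location, ticket.key) in stored:
        return location, 1
    raise FileNotFoundError(ticket.key)

stored = {(2, "ABC123")}
doi = {"ABC123": 2}
good = locate(Ticket("ABC123", hint=2), stored, doi)   # hint is correct
stale = locate(Ticket("ABC123", hint=1), stored, doi)  # hint is stale
```

In this model a correct hint costs no catalog access, while a stale hint costs one lookup on every read.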
If daos.cfg is deleted:
If there is no daos.cfg file when the API is started, DAOS will begin
at the root of the DAOS
repository and enumerate the files in each subdirectory. This is a
relatively costly filesystem operation, and can take a significant
amount of time, possibly several hours if there are a large number of
.NLO files in the repository.
This startup delay often appears to be a hang, and a common
reaction is to kill the process and start it again. If the counting
operation is interrupted in this manner, it's possible for an incomplete
daos.cfg file to be produced.
All location references in the DOI and in the DAOS tickets are a
numeric index into the list of paths in this file. If the file is
incomplete and no path exists at the requested index, or if the wrong
path is listed at the index, the resulting path constructed for the .NLO
file access will be invalid, and the access will fail.
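To make the failure mode concrete, here is a minimal sketch of resolving a numeric location index against a path list. The function name, the directory names, and the ".nlo" path construction are assumptions for illustration only; the real path format is not given in this note.

```python
import posixpath

def resolve_nlo_path(paths, index, key):
    """Build the path of an .NLO file from a numeric index into the path list."""
    if not 0 <= index < len(paths):
        raise LookupError(f"no path at index {index}")
    return posixpath.join(paths[index], key + ".nlo")

complete = ["daos/0001", "daos/0002", "daos/0003"]
truncated = complete[:2]  # what an interrupted count might leave behind

ok_path = resolve_nlo_path(complete, 2, "ABC123")
try:
    resolve_nlo_path(truncated, 2, "ABC123")
    failed = False
except LookupError:
    failed = True  # index 2 no longer exists, so the access fails
```

A list with the wrong path at an index is harder to spot in a model like this: the call succeeds and returns a well-formed path that simply points at a directory where the file is not stored.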
If nothing else, rebuilding this file is usually just a waste of
time and machine resources. The file counts here do not have to be
exact, and unless the file is corrupted, a full data restore is being
performed, or the file is otherwise known to contain significantly
incorrect information, there is no benefit to re-counting all of the
files and re-creating it.
If daoscat.nsf is deleted:
DAOS will create a new daoscat.nsf if one does not exist at API
startup. This file will contain an empty DIT, DOI, and deletion list.
When a new attachment is received by DAOS, it calculates the key
(checksum) of the contents, and then looks up the key in the DOI to see
if the attachment already exists. If it does not already exist, a new
entry is created, and the location of this new .NLO file is stored in
the DOI and in the DAOS ticket in the NSF. If the DOI was empty due to
daoscat being deleted, DAOS will not be able to locate a
previously-existing copy of this new .NLO file, so the new one will be a
duplicate. This duplicate .NLO is a waste of disk space.
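The effect of an empty DOI on deduplication can be shown with a small model. Everything here is a sketch under stated assumptions: SHA-256 stands in for whatever checksum DAOS actually uses, and the dict and list stand in for the DOI and the repository.

```python
import hashlib

def store_attachment(content, doi, nlo_files, location):
    """Store content once per key; return a ticket with the key and a location hint."""
    key = hashlib.sha256(content).hexdigest()  # assumed checksum, for illustration
    entry = doi.get(key)
    if entry is None:
        # Key not in the catalog: write a new .NLO file and record where it went.
        nlo_files.append((location, key))
        entry = doi[key] = {"refs": 0, "location": location}
    entry["refs"] += 1
    return {"key": key, "hint": entry["location"]}

doi, nlo_files = {}, []
store_attachment(b"report contents", doi, nlo_files, location=0)
store_attachment(b"report contents", doi, nlo_files, location=0)
files_with_catalog = len(nlo_files)   # second store found the key, no new file

doi = {}  # simulate daoscat.nsf being deleted and recreated empty
ticket = store_attachment(b"report contents", doi, nlo_files, location=1)
files_after_delete = len(nlo_files)   # same content is now stored twice
```

The ticket written after the catalog was emptied points at the duplicate, which is the hint that goes stale once a later resync removes that duplicate.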
When DAOS resync is run at some point later, the DOI will be
populated, and the duplicate will be detected and deleted. The ticket
in the NSF will not be updated, however, so the location hint will be
incorrect. When this attachment needs to be read, DAOS will first look at
the hint location in the ticket, which will fail. It will then use the
key to look in the DOI for the correct location of the .NLO file, which
will succeed. The retry logic allows access to the attachment, but every
such read adds a lookup against daoscat.nsf.
As of 8.5.2, DAOS resync keeps the old DOI active while it is
populating a new one. This allows DAOS to correctly look up the
location of .NLO files during the resync. An empty DOI is of no use in
this respect.
If it is necessary to restore an .NSF file, the following command
will display a list of the .NLO files that do not currently reside in
the DAOS repository, and need to be restored for all attachments to be
available:
LISTNLO MISSING
The output of this command is based on the hint location in the DAOS
tickets. If the hint location is incorrect because a duplicate .NLO file
was deleted, it will be more difficult to restore the appropriate .NLO
files since they won't have been backed up at the location displayed.
DAOS resync will also re-populate the deletion list. All entries
added to the deletion list will be datestamped with the time of the
resync. The DAOS prune operation will wait until the datestamps are
older than the deferred deletion interval, so resetting the datestamps
of these entries will delay the prune operations, causing unnecessary
disk space usage.
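The delay can be worked through with a short date calculation. The function name, the dict used for the deletion list, and the 30-day interval are assumptions chosen for the example, not values taken from DAOS.

```python
from datetime import datetime, timedelta

def prunable(deletion_list, now, deferred_days):
    """Return keys whose datestamp is older than the deferred deletion interval."""
    cutoff = now - timedelta(days=deferred_days)
    return [key for key, stamp in deletion_list.items() if stamp < cutoff]

now = datetime(2024, 3, 1)
interval = 30  # hypothetical deferred deletion interval, in days

# Original catalog: the last reference went away 40 days ago.
before_resync = {"ABC123": now - timedelta(days=40)}
eligible_before = prunable(before_resync, now, interval)

# Rebuilt catalog: resync stamps the entry with the time of the resync.
after_resync = {"ABC123": now}
eligible_after = prunable(after_resync, now, interval)
```

In this model the file was already eligible for pruning, but after the rebuild it stays on disk for another full interval.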
Even if the DAOS catalog is not in Synchronized state, it
contains valuable information. Again, unless the file is corrupted, a
full data restore is being performed, or the file is otherwise known to
contain significantly incorrect information, there is no benefit to
re-creating it.
Recovering from deleted files:
If daos.cfg is deleted, it will be recreated at startup. It is crucial
that the creation process be allowed to complete so that a correct
daos.cfg is produced.
If daoscat.nsf is deleted, the server should not be operated
until a DAOS resync has been performed. The safest way to do this is to
run a standalone DAOS resync before the server is started.
On 8.5.2 and newer environments, use the following command to get to the minimum usable state before the server is operated:
RESYNC QUICK
This option will populate the DIT and DOI, but will not update the
reference counts. All DAOS functionality will be completely operational
after a RESYNC QUICK except for prune. Prune will not run until the
reference counts have been updated by a normal resync, and the DAOS
catalog is in Synchronized state.
Once the DAOS catalog is in Synchronized state, a fixup operation
on an NSF will update the hint location of the DAOS tickets in the NSF
with the current information from the DOI.
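Conceptually, the fixup step amounts to copying the current location from the DOI back into each ticket. The sketch below is a simplified model with hypothetical names and data shapes; it does not reflect how fixup is actually implemented.

```python
def fixup(tickets, doi):
    """Rewrite stale location hints from the catalog; return how many changed."""
    changed = 0
    for ticket in tickets:
        current = doi.get(ticket["key"])
        # Leave the ticket alone if the key is unknown or the hint is already right.
        if current is not None and ticket["hint"] != current:
            ticket["hint"] = current
            changed += 1
    return changed

doi = {"ABC123": 2, "DEF456": 0}
tickets = [
    {"key": "ABC123", "hint": 1},  # stale: points at a deleted duplicate
    {"key": "DEF456", "hint": 0},  # already correct
]
updated = fixup(tickets, doi)
```

Once the hints match the DOI again, reads of these attachments no longer need the fallback lookup in the catalog.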
Is it ever OK to delete these files?
If all data is being restored (as in the case of a disaster recovery
scenario) and all .NSF and .NLO data is being recovered from backup,
the daos.cfg should be deleted, and an offline resync (or RESYNC QUICK
at a minimum) should be performed to get an accurate accounting of all
files and references. These two files should NOT be restored from the
backup in this situation.
If either of these files was accidentally copied or restored
from a different server or timeframe, or if either is corrupted beyond
repair, deleting it (and recreating it as described above) is an
acceptable response.
Other than that, you're better off leaving the existing files in place and doing a resync.