file types

snoop.data._file_types private

Constant definitions for mime type - file type mapping.

The mime type is the one returned by libmagic in snoop.data.magic.

The "file type" is a user-friendly category of a mime type. It's stored on the documents as filetype, used for logic in switches, and presented in the UI as a first-class attribute of the document. Examples of file types: "folder", "email", "archive".

Not all mime types have a file type bound to them.

Attributes

FILE_TYPES

Mapping from mime types to Hoover file types.

Used by snoop.data.digests.get_filetype.
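
The lookup described above can be sketched as a plain dictionary access. This is only an illustration: the entries in `EXAMPLE_FILE_TYPES` and the helper `lookup_filetype` are hypothetical, chosen to match the example file types named earlier ("folder", "email", "archive"). The real `FILE_TYPES` table and `snoop.data.digests.get_filetype` may differ.

```python
# Hypothetical excerpt for illustration; not the project's actual FILE_TYPES table.
EXAMPLE_FILE_TYPES = {
    "message/rfc822": "email",
    "application/zip": "archive",
    "application/x-directory": "folder",
}


def lookup_filetype(mime_type):
    """Return the file type bound to a mime type, or None if there is none."""
    # dict.get returns None for unmapped mime types, which mirrors the note
    # that not all mime types have a file type bound to them.
    return EXAMPLE_FILE_TYPES.get(mime_type)
```

With this sketch, `lookup_filetype("message/rfc822")` gives `"email"`, while an unmapped mime type gives `None`.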

Functions

allow_processing_for_mime_type(mime_type, sample_extension)

Check whether the document should be processed, based on mime type and extension. Returns False if the document should be skipped and True otherwise.

The document is skipped if the given mime_type is listed in settings.SNOOP_SKIP_PROCESSING_MIME_TYPES.

It is also skipped if either the file extension guessed by the mimetypes module or the given sample_extension is listed in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS. Each skip is logged as a warning.

Source code in snoop/data/_file_types.py
def allow_processing_for_mime_type(mime_type, sample_extension):
    """Check if we want to skip processing the document, based on mime type and extension.

    We check if the given `mime_type` is listed in `settings.SNOOP_SKIP_PROCESSING_MIME_TYPES`.
    We also check if the given `sample_extension` is listed in `settings.SNOOP_SKIP_PROCESSING_EXTENSIONS`.

    We also check if the file extension guessed by the `mimetypes` module is listed in
    `settings.SNOOP_SKIP_PROCESSING_EXTENSIONS`. """
    if mime_type in settings.SNOOP_SKIP_PROCESSING_MIME_TYPES:
        log.warning('skipping document with mime type = "%s"', mime_type)
        return False
    ext = mimetypes.guess_extension(mime_type)
    if ext in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS:
        log.warning('skipping document with guessed extension = "%s"', ext)
        return False
    if sample_extension and sample_extension in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS:
        log.warning('skipping document with filename extension = "%s"', sample_extension)
        return False
    return True
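
A minimal runnable sketch of the same checks follows, for trying the behaviour outside the project. It assumes `settings` can be replaced by a plain object with two list attributes. The `SimpleNamespace` stand-in and its values are hypothetical; the real project reads these from its own settings. It also assumes extensions carry a leading dot, as returned by `mimetypes.guess_extension`.

```python
import logging
import mimetypes
from types import SimpleNamespace

log = logging.getLogger(__name__)

# Hypothetical stand-in for the project settings; values are illustrative only.
settings = SimpleNamespace(
    SNOOP_SKIP_PROCESSING_MIME_TYPES=["application/x-dosexec"],
    SNOOP_SKIP_PROCESSING_EXTENSIONS=[".iso", ".pdf"],
)


def allow_processing_for_mime_type(mime_type, sample_extension):
    """Return False if the document should be skipped, True otherwise."""
    # 1. Skip by mime type.
    if mime_type in settings.SNOOP_SKIP_PROCESSING_MIME_TYPES:
        log.warning('skipping document with mime type = "%s"', mime_type)
        return False
    # 2. Skip by the extension guessed from the mime type (may be None).
    ext = mimetypes.guess_extension(mime_type)
    if ext in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS:
        log.warning('skipping document with guessed extension = "%s"', ext)
        return False
    # 3. Skip by the extension taken from the file name, if one was given.
    if sample_extension and sample_extension in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS:
        log.warning('skipping document with filename extension = "%s"', sample_extension)
        return False
    return True


if __name__ == "__main__":
    print(allow_processing_for_mime_type("text/plain", ".txt"))
    print(allow_processing_for_mime_type("application/x-dosexec", ".exe"))
    print(allow_processing_for_mime_type("text/plain", ".iso"))
```

Note that the return value is the inverse of "skip": callers process the document only when the function returns True.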