file types
snoop.data._file_types
private
Constant definitions for the mapping from mime types to file types.
The mime type is the one returned by libmagic in snoop.data.magic.
The "file type" is a user-friendly category of a mime type. It's stored on the documents as filetype, used
for logic in switches, and presented in the UI as a first-class attribute of the document. Examples of file
types: "folder", "email", "archive".
Not all mime types have a file type bound to them.
Attributes
FILE_TYPES
Mapping from mime types to Hoover file types.
Used by snoop.data.digests.get_filetype.
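The lookup described above can be sketched as a plain dictionary access. This is only an illustration: the entries of `EXAMPLE_FILE_TYPES` and the helper `lookup_filetype` are hypothetical, the real `FILE_TYPES` contents are not shown on this page, and `snoop.data.digests.get_filetype` may apply additional logic.

```python
# Hypothetical excerpt of a mime type -> file type mapping.
# The real FILE_TYPES constant is larger; these entries are assumptions.
EXAMPLE_FILE_TYPES = {
    "application/zip": "archive",
    "message/rfc822": "email",
    "inode/directory": "folder",
}


def lookup_filetype(mime_type, mapping=EXAMPLE_FILE_TYPES):
    """Return the file type bound to `mime_type`, or None if there is none."""
    # dict.get returns None for unmapped mime types, which models the
    # statement that not all mime types have a file type bound to them.
    return mapping.get(mime_type)
```

Returning `None` for an unmapped mime type is a design choice of this sketch, not a documented behavior of the real module.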
Functions
allow_processing_for_mime_type(mime_type, sample_extension)
Decide whether a document should be processed, based on its mime type and extension. Returns False (skip processing) as soon as one of the checks below matches, and True otherwise.
We check if the given mime_type is listed in settings.SNOOP_SKIP_PROCESSING_MIME_TYPES.
We also check if the file extension guessed by the mimetypes module is listed in
settings.SNOOP_SKIP_PROCESSING_EXTENSIONS.
We also check if the given sample_extension is listed in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS. An empty sample_extension is ignored.
Source code in snoop/data/_file_types.py
def allow_processing_for_mime_type(mime_type, sample_extension):
    """Check if we want to skip processing the document, based on mime type and extension.

    We check if the given `mime_type` is listed in `settings.SNOOP_SKIP_PROCESSING_MIME_TYPES`.
    We also check if the given `sample_extension` is listed in `settings.SNOOP_SKIP_PROCESSING_EXTENSIONS`.
    We also check if the file extension guessed by the `mimetypes` module is listed in
    `settings.SNOOP_SKIP_PROCESSING_EXTENSIONS`.
    """

    if mime_type in settings.SNOOP_SKIP_PROCESSING_MIME_TYPES:
        log.warning('skipping document with mime type = "%s"', mime_type)
        return False

    ext = mimetypes.guess_extension(mime_type)
    if ext in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS:
        log.warning('skipping document with guessed extension = "%s"', ext)
        return False

    if sample_extension and sample_extension in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS:
        log.warning('skipping document with filename extension = "%s"', sample_extension)
        return False

    return True
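
To try the checks outside the project, the function can be reproduced with a stand-in for the settings object. The following is a minimal sketch under stated assumptions: the `settings` namespace and its two example values are hypothetical, and passing `sample_extension` with a leading dot is an assumption made to match what `mimetypes.guess_extension` returns.

```python
import logging
import mimetypes
from types import SimpleNamespace

log = logging.getLogger(__name__)

# Stand-in for the project's settings module; both values are
# illustrative assumptions, not the project's defaults.
settings = SimpleNamespace(
    SNOOP_SKIP_PROCESSING_MIME_TYPES=["application/x-dosexec"],
    SNOOP_SKIP_PROCESSING_EXTENSIONS=[".iso"],
)


def allow_processing_for_mime_type(mime_type, sample_extension):
    """Return False if the document should be skipped, True otherwise."""
    # 1. Skip by mime type.
    if mime_type in settings.SNOOP_SKIP_PROCESSING_MIME_TYPES:
        log.warning('skipping document with mime type = "%s"', mime_type)
        return False

    # 2. Skip by the extension guessed from the mime type.
    #    guess_extension returns None for unknown mime types, and an
    #    extension with a leading dot (such as ".pdf") otherwise.
    ext = mimetypes.guess_extension(mime_type)
    if ext in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS:
        log.warning('skipping document with guessed extension = "%s"', ext)
        return False

    # 3. Skip by the extension taken from a sample filename, if any.
    if sample_extension and sample_extension in settings.SNOOP_SKIP_PROCESSING_EXTENSIONS:
        log.warning('skipping document with filename extension = "%s"', sample_extension)
        return False

    return True


if __name__ == "__main__":
    print(allow_processing_for_mime_type("application/x-dosexec", ".exe"))
    print(allow_processing_for_mime_type("application/x-hypothetical", ".iso"))
    print(allow_processing_for_mime_type("application/x-hypothetical", ".txt"))
```

Because the guessed extension always carries a leading dot, entries in the extension skip list need the same form for the second check to match.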