This repository was archived by the owner on Aug 25, 2024. It is now read-only.

operation : archive : Add support for zip,tar, etc archives #1128

Merged
merged 3 commits into from
Jul 1, 2021

Conversation

programmer290399
Contributor

@programmer290399 programmer290399 commented Jun 13, 2021

@codecov-commenter

codecov-commenter commented Jun 13, 2021

Codecov Report

Merging #1128 (f8030a8) into master (f0d826f) will increase coverage by 0.17%.
The diff coverage is 97.14%.


@@            Coverage Diff             @@
##           master    #1128      +/-   ##
==========================================
+ Coverage   84.46%   84.63%   +0.17%     
==========================================
  Files         152      156       +4     
  Lines       10040    10180     +140     
  Branches     1662     1677      +15     
==========================================
+ Hits         8480     8616     +136     
- Misses       1213     1215       +2     
- Partials      347      349       +2     
Impacted Files                          Coverage Δ
dffml/operation/archive.py              84.61% <84.61%> (ø)
dffml/operation/compression.py          100.00% <100.00%> (ø)
tests/operation/test_archive.py         100.00% <100.00%> (ø)
tests/operation/test_compression.py     100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@programmer290399 programmer290399 changed the title WIP : util : file : Add format agnostic archive extraction/creation functions WIP : operations : archive : Add support for zip,tar, etc archives Jun 22, 2021
@programmer290399 programmer290399 changed the title WIP : operations : archive : Add support for zip,tar, etc archives WIP : operation : archive : Add support for zip,tar, etc archives Jun 22, 2021
@programmer290399 programmer290399 changed the title WIP : operation : archive : Add support for zip,tar, etc archives operation : archive : Add support for zip,tar, etc archives Jun 29, 2021
@johnandersen777

Otherwise, keep the current file-path-based approach and rename the definitions to include a compressed/decompressed prefix:

  • input_file_path -> compressed_file_path
  • Make sure to output the decompressed/compressed_file_path

@johnandersen777

Creating operations for each format with a loop

diff --git a/dffml/operation/compression.py b/dffml/operation/compression.py
index 7d5b16f4f..f0a41fbd9 100644
--- a/dffml/operation/compression.py
+++ b/dffml/operation/compression.py
@@ -7,80 +7,77 @@ from pathlib import Path
 from ..df.base import op
 from ..df.types import Definition
 
-# definitions
-INPUT_FILE_PATH = Definition(name="input_file_path", primitive="str")
-OUTPUT_FILE_PATH = Definition(name="output_file_path", primitive="str")
-FORMAT = Definition(name="format", primitive="str")
 
-SUPPORTED_COMPRESSION_FORMATS = {".gz": gzip, ".bz2": bz2, ".xz": lzma}
+def make_compress(extension, compression_cls):
+    async def compress(
+        input_file_path: str, output_file_path: str,
+    ):
+        f"""
+        A simple function to compress a {extension} file.
 
+        Parameters
+        ----------
+        input_file_path : str
+            Path of the file to be compressed.
+        output_file_path : str
+            Path where the output should be saved (should include file name).
+        """
+        with open(input_file_path, "rb") as f_in:
+            with compression_cls.open(output_file_path, "wb") as f_out:
+                shutil.copyfileobj(f_in, f_out)
 
-class UnsupportedCompressionFormatError(Exception):
-    def __init__(self, format):
-        super().__init__()
-        self.format = format
+    return compress
 
-    def __str__(self):
-        return f"{self.format} format is not currently supported."
 
+def make_decompress(extension, compression_cls):
+    async def decompress(input_file_path: str, output_file_path: str):
+        f"""
+        A simple function to decompress a {extension} file.
 
-def get_compression_class(format):
-    compression_cls = SUPPORTED_COMPRESSION_FORMATS.get(format, None)
-    if compression_cls is None:
-        raise UnsupportedCompressionFormatError(format)
-    return compression_cls
+        Parameters
+        ----------
+        input_file_path : str
+            Path of the file to be decompressed.
+        output_file_path : str
+            Path where the output should be saved (should include file name).
+        """
+        input_file_path = Path(input_file_path)
+        file_format = input_file_path.suffix
+        with compression_cls.open(input_file_path, "rb") as f_in:
+            with open(output_file_path, "wb") as f_out:
+                shutil.copyfileobj(f_in, f_out)
 
+    return decompress
 
-@op(
-    inputs={
-        "input_file_path": INPUT_FILE_PATH,
-        "output_file_path": OUTPUT_FILE_PATH,
-        "file_format": FORMAT,
-    },
-    outputs={},
-)
-async def compress(
-    input_file_path: str, output_file_path: str, file_format: str
-):
-    """
-    A simple function to compress a file in a certain format.
 
-    Parameters
-    ----------
-    input_file_path : str
-        Path of the file to be compressed.
-    output_file_path : str
-        Path where the output should be saved (should include file name).
-    file_format : str
-        Format of the compressed output file.
-    """
-    compression_cls = get_compression_class(file_format)
-    with open(input_file_path, "rb") as f_in:
-        with compression_cls.open(output_file_path, "wb") as f_out:
-            shutil.copyfileobj(f_in, f_out)
+SUPPORTED_COMPRESSION_FORMATS = {"gz": gzip, "bz2": bz2, "xz": lzma}
 
-
-@op(
-    inputs={
-        "input_file_path": INPUT_FILE_PATH,
-        "output_file_path": OUTPUT_FILE_PATH,
-    },
-    outputs={},
-)
-async def de_compress(input_file_path: str, output_file_path: str):
-    """
-    A simple function to de_compress a file.
-
-    Parameters
-    ----------
-    input_file_path : str
-        Path of the file to be decompressed.
-    output_file_path : str
-        Path where the output should be saved (should include file name).
-    """
-    input_file_path = Path(input_file_path)
-    file_format = input_file_path.suffix
-    compression_cls = get_compression_class(file_format)
-    with compression_cls.open(input_file_path, "rb") as f_in:
-        with open(output_file_path, "wb") as f_out:
-            shutil.copyfileobj(f_in, f_out)
+for extension, compression_cls in SUPPORTED_COMPRESSION_FORMATS.items():
+    # Create definitions for compressed/decompressed file path for this format
+    compressed_file_path = Definition(
+        name=f"compressed_{extension}_file_path", primitive="str"
+    )
+    decompressed_file_path = Definition(
+        name=f"decompressed_{extension}_file_path", primitive="str"
+    )
+    # Create the compression function, and wrap it with the op decorator to make
+    # it an operation / operation implementation
+    compress = op(
+        inputs={
+            "input_file_path": decompressed_file_path,
+            "output_file_path": compressed_file_path,
+        },
+        outputs={},
+    )(make_compress(extension, compression_cls))
+    # Do the same for decompress
+    decompress = op(
+        inputs={
+            "input_file_path": compressed_file_path,
+            "output_file_path": decompressed_file_path,
+        },
+        outputs={},
+    )(make_decompress(extension, compression_cls))
+    # At the global scope of this file, gz_compress gets set to the return value
+    # of make_compress, wrapped by op.
+    setattr(sys.modules[__name__], f"{extension}_compress", compress)
+    setattr(sys.modules[__name__], f"{extension}_decompress", decompress)
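The loop in the diff above generates one function per format and attaches it to the module under a computed name. A minimal, standard-library-only sketch of that pattern follows. It makes several assumptions: the `op` decorator and `Definition` objects from DFFML are left out, `ops` is a hypothetical stand-in module object used here in place of `sys.modules[__name__]`, and `FORMATS` is an illustrative name. Because Python does not treat a formatted string literal as a docstring, the sketch assigns `__doc__` explicitly.

```python
import bz2
import gzip
import lzma
import shutil
import types

# Hypothetical stand-in for the real module. The PR sets attributes on
# sys.modules[__name__] instead.
ops = types.ModuleType("compression_ops")

FORMATS = {"gz": gzip, "bz2": bz2, "xz": lzma}


def make_compress(extension, compression_cls):
    async def compress(input_file_path: str, output_file_path: str):
        with open(input_file_path, "rb") as f_in:
            with compression_cls.open(output_file_path, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)

    # An f-string is an expression, not a docstring, so set __doc__ explicitly.
    compress.__doc__ = f"Compress a file to {extension} format."
    # Give each generated function a distinct name for tracebacks and repr().
    compress.__name__ = f"{extension}_compress"
    return compress


for extension, compression_cls in FORMATS.items():
    # The factory call binds the current loop values to the new function.
    setattr(ops, f"{extension}_compress", make_compress(extension, compression_cls))

generated = sorted(name for name in vars(ops) if name.endswith("_compress"))
print(generated)
```

Passing `extension` and `compression_cls` through a factory function, rather than defining the coroutine directly inside the loop, avoids the late-binding closure pitfall in which every generated function would end up using the last format in the dictionary.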

@johnandersen777

Full source

import bz2
import gzip
import lzma
import shutil
import sys
from pathlib import Path

from ..df.base import op
from ..df.types import Definition


def make_compress(extension, compression_cls):
    async def compress(
        input_file_path: str, output_file_path: str,
    ):
        with open(input_file_path, "rb") as f_in:
            with compression_cls.open(output_file_path, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)

    # An f-string literal is not treated as a docstring, so assign it explicitly
    compress.__doc__ = f"""
    A simple function to compress a file into {extension} format.

    Parameters
    ----------
    input_file_path : str
        Path of the file to be compressed.
    output_file_path : str
        Path where the output should be saved (should include file name).
    """
    return compress


def make_decompress(extension, compression_cls):
    async def decompress(input_file_path: str, output_file_path: str):
        input_file_path = Path(input_file_path)
        with compression_cls.open(input_file_path, "rb") as f_in:
            with open(output_file_path, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)

    # An f-string literal is not treated as a docstring, so assign it explicitly
    decompress.__doc__ = f"""
    A simple function to decompress a {extension} file.

    Parameters
    ----------
    input_file_path : str
        Path of the file to be decompressed.
    output_file_path : str
        Path where the output should be saved (should include file name).
    """
    return decompress


SUPPORTED_COMPRESSION_FORMATS = {"gz": gzip, "bz2": bz2, "xz": lzma}

for extension, compression_cls in SUPPORTED_COMPRESSION_FORMATS.items():
    # Create definitions for compressed/decompressed file path for this format
    compressed_file_path = Definition(
        name=f"compressed_{extension}_file_path", primitive="str"
    )
    decompressed_file_path = Definition(
        name=f"decompressed_{extension}_file_path", primitive="str"
    )
    # Create the compression function, and wrap it with the op decorator to make
    # it an operation / operation implementation
    compress = op(
        inputs={
            "input_file_path": decompressed_file_path,
            "output_file_path": compressed_file_path,
        },
        outputs={},
    )(make_compress(extension, compression_cls))
    # Do the same for decompress
    decompress = op(
        inputs={
            "input_file_path": compressed_file_path,
            "output_file_path": decompressed_file_path,
        },
        outputs={},
    )(make_decompress(extension, compression_cls))
    # At the global scope of this file, gz_compress gets set to the return value
    # of make_compress, wrapped by op.
    setattr(sys.modules[__name__], f"{extension}_compress", compress)
    setattr(sys.modules[__name__], f"{extension}_decompress", decompress)
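As a rough check of what the generated coroutines do, the sketch below runs a compress/decompress round trip for each format using only the standard library. It is written under stated assumptions: the DFFML `op` wrapper is omitted, so the coroutines are awaited directly rather than run in a dataflow, and `round_trip`, `payload`, and the temporary file names are hypothetical names introduced for illustration.

```python
import asyncio
import bz2
import gzip
import lzma
import shutil
import tempfile
from pathlib import Path


def make_compress(compression_cls):
    async def compress(input_file_path: str, output_file_path: str):
        with open(input_file_path, "rb") as f_in:
            with compression_cls.open(output_file_path, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)

    return compress


def make_decompress(compression_cls):
    async def decompress(input_file_path: str, output_file_path: str):
        with compression_cls.open(input_file_path, "rb") as f_in:
            with open(output_file_path, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)

    return decompress


async def round_trip(data: bytes, compression_cls) -> bytes:
    """Write data, compress it, decompress it, and return the restored bytes."""
    compress = make_compress(compression_cls)
    decompress = make_decompress(compression_cls)
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp, "original.bin")
        packed = Path(tmp, "packed.bin")
        restored = Path(tmp, "restored.bin")
        original.write_bytes(data)
        await compress(str(original), str(packed))
        await decompress(str(packed), str(restored))
        return restored.read_bytes()


payload = b"hello dffml"
results = {
    extension: asyncio.run(round_trip(payload, compression_cls))
    for extension, compression_cls in {"gz": gzip, "bz2": bz2, "xz": lzma}.items()
}
```

Each entry in `results` should equal `payload` if the round trip preserved the data. The `gzip`, `bz2`, and `lzma` modules all expose an `open()` function that accepts a path and a binary mode, which is what lets one factory serve all three formats.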

@programmer290399
Contributor Author

programmer290399 commented Jun 30, 2021

Otherwise, keep the current file-path-based approach and rename the definitions to include a compressed/decompressed prefix:

  • input_file_path -> compressed_file_path
  • Make sure to output the decompressed/compressed_file_path

Related Enhancement Issue: #1145

@johnandersen777

Very nice, thanks for the quick turn-around. Merged

pdxjohnny and others added 3 commits June 30, 2021 11:04

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
Remove redundant "Added"

Signed-off-by: John Andersen <johnandersenpdx@gmail.com>

@johnandersen777 johnandersen777 merged commit 5edb093 into intel:master Jul 1, 2021
@johnandersen777

Had to rebase things into logical commits. I see you saw that I'd been messing with it :)

Development

Successfully merging this pull request may close these issues.

model: Rename directory property to location
4 participants