Compare commits

...

9 Commits

Author SHA1 Message Date
be983c61bc Fix #353
The two primary cases fixed are:
Ms. Marvel
spider-man/deadpool

The first issue removed 'Ms.', which is a problem because many comics have
series where the only difference in the title is the
designation/honorific.

The second issue is that the '/' was removed and not replaced with
anything causing a search for 'mandeadpool' which will not show useful
results.

Consequently all designations/honorifics are now untouched
All punctuation is replaced with a space
2022-08-12 07:10:36 -07:00
77a53a6834 Update dependencies
Includes changes from pyupgrade
2022-08-10 20:55:46 -07:00
860a3147d2 Construct URL correctly 2022-08-10 16:33:40 -07:00
8ecb87fa26 Install all optional dependencies in CI 2022-08-08 19:10:57 -07:00
f17f560705 Fix tests on windows
Make the speedup dependency of thefuzz optional; it requires a C compiler
2022-08-08 19:03:25 -07:00
aadeb07c49 Fix issues
Refactor add_to_path with tests
Fix type hints for titles_match
Use casefold in get_language
Fix using the recursive flag in cli mode
Add http status code to ComicVine exceptions
Fix parenthesis getting removed when renaming
Add more tests
2022-08-08 18:05:06 -07:00
e07fe9e8d1 Construct URLs more consistently 2022-07-29 22:05:22 -07:00
f2a68d6c8b Fix rename and add test 2022-07-29 22:05:03 -07:00
94be266e17 Handle the 'primary' key missing in get_primary_credit
Fixes #342
Add better exception handling for the formatter
2022-07-27 23:24:34 -07:00
27 changed files with 318 additions and 140 deletions

View File

@ -84,7 +84,7 @@ jobs:
- name: Build and install PyPi packages
run: |
make clean pydist
python -m pip install "dist/$(python setup.py --fullname)-py3-none-any.whl[GUI,CBR]"
python -m pip install "dist/$(python setup.py --fullname)-py3-none-any.whl[all]"
- name: build
run: |

View File

@ -45,7 +45,7 @@ jobs:
- name: Build, Install and Test PyPi packages
run: |
make clean pydist
python -m pip install "dist/$(python setup.py --fullname)-py3-none-any.whl[GUI,CBR]"
python -m pip install "dist/$(python setup.py --fullname)-py3-none-any.whl[all]"
echo "CT_FULL_NAME=$(python setup.py --fullname)" >> $GITHUB_ENV
python -m flake8
python -m pytest

View File

@ -1,7 +1,7 @@
exclude: ^scripts
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.2.0
rev: v4.3.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
@ -10,7 +10,7 @@ repos:
- id: name-tests-test
- id: requirements-txt-fixer
- repo: https://github.com/asottile/setup-cfg-fmt
rev: v1.20.1
rev: v2.0.0
hooks:
- id: setup-cfg-fmt
- repo: https://github.com/PyCQA/isort
@ -19,12 +19,12 @@ repos:
- id: isort
args: [--af,--add-import, 'from __future__ import annotations']
- repo: https://github.com/asottile/pyupgrade
rev: v2.32.1
rev: v2.37.3
hooks:
- id: pyupgrade
args: [--py39-plus]
- repo: https://github.com/psf/black
rev: 22.3.0
rev: 22.6.0
hooks:
- id: black
- repo: https://github.com/PyCQA/autoflake
@ -33,12 +33,12 @@ repos:
- id: autoflake
args: [-i]
- repo: https://github.com/PyCQA/flake8
rev: 4.0.1
rev: 5.0.4
hooks:
- id: flake8
additional_dependencies: [flake8-encodings, flake8-warnings, flake8-builtins, flake8-eradicate, flake8-length, flake8-print]
additional_dependencies: [flake8-encodings, flake8-warnings, flake8-builtins, flake8-length, flake8-print]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v0.960
rev: v0.971
hooks:
- id: mypy
additional_dependencies: [types-setuptools, types-requests]

View File

@ -114,7 +114,7 @@ class SevenZipArchiver(UnknownArchiver):
return False
def read_file(self, archive_file: str) -> bytes:
data = bytes()
data = b""
try:
with py7zr.SevenZipFile(self.path, "r") as zf:
data = zf.read(archive_file)[archive_file].read()
@ -422,7 +422,7 @@ class RarArchiver(UnknownArchiver):
rarc = self.get_rar_obj()
if rarc is None:
return bytes()
return b""
tries = 0
while tries < 7:
@ -665,7 +665,7 @@ class FolderArchiver(UnknownArchiver):
class ComicArchive:
logo_data = bytes()
logo_data = b""
class ArchiveType:
SevenZip, Zip, Rar, Folder, Pdf, Unknown = list(range(6))
@ -750,7 +750,7 @@ class ComicArchive:
if new_path == self.path:
return
os.makedirs(new_path.parent, 0o777, True)
shutil.move(path, new_path)
shutil.move(self.path, new_path)
self.path = new_path
self.archiver.path = pathlib.Path(path)
@ -853,13 +853,13 @@ class ComicArchive:
return retcode
def get_page(self, index: int) -> bytes:
image_data = bytes()
image_data = b""
filename = self.get_page_name(index)
if filename:
try:
image_data = self.archiver.read_file(filename) or bytes()
image_data = self.archiver.read_file(filename) or b""
except Exception:
logger.error("Error reading in page %d. Substituting logo page.", index)
image_data = ComicArchive.logo_data
@ -1033,7 +1033,7 @@ class ComicArchive:
raw_cix = self.archiver.read_file(self.ci_xml_filename) or b""
except Exception as e:
logger.error("Error reading in raw CIX! for %s: %s", self.path, e)
raw_cix = bytes()
raw_cix = b""
return raw_cix
def write_cix(self, metadata: GenericMetadata) -> bool:

View File

@ -97,7 +97,14 @@ class ComicBookInfo:
metadata.country = utils.xlate(cbi["country"])
metadata.critical_rating = utils.xlate(cbi["rating"], True)
metadata.credits = cbi["credits"]
metadata.credits = [
Credits(
person=x["person"] if "person" in x else "",
role=x["role"] if "role" in x else "",
primary=x["primary"] if "primary" in x else False,
)
for x in cbi["credits"]
]
metadata.tags = set(cbi["tags"]) if cbi["tags"] is not None else set()
# make sure credits and tags are at least empty lists and not None

View File

@ -24,7 +24,8 @@ import logging
import os
import re
from operator import itemgetter
from typing import Callable, Match, TypedDict
from re import Match
from typing import Callable, TypedDict
from urllib.parse import unquote
from text2digits import text2digits
@ -67,17 +68,10 @@ class FileNameParser:
# replace any name separators with spaces
tmpstr = self.fix_spaces(filename)
found = False
match = re.search(r"(?<=\sof\s)\d+(?=\s)", tmpstr, re.IGNORECASE)
match = re.search(r"(?:\s\(?of\s)(\d+)(?: |\))", tmpstr, re.IGNORECASE)
if match:
count = match.group()
found = True
if not found:
match = re.search(r"(?<=\(of\s)\d+(?=\))", tmpstr, re.IGNORECASE)
if match:
count = match.group()
count = match.group(1)
return count.lstrip("0")
@ -180,10 +174,12 @@ class FileNameParser:
if "--" in filename:
# the pattern seems to be that anything to left of the first "--" is the series name followed by issue
filename = re.sub(r"--.*", self.repl, filename)
# never happens
elif "__" in filename:
# the pattern seems to be that anything to left of the first "__" is the series name followed by issue
filename = re.sub(r"__.*", self.repl, filename)
# never happens
filename = filename.replace("+", " ")
tmpstr = self.fix_spaces(filename, remove_dashes=False)
@ -425,6 +421,7 @@ def parse(p: Parser) -> Callable[[Parser], Callable | None] | None:
likely_year = False
if p.firstItem and p.first_is_alt:
p.alt = True
p.firstItem = False
return parse_issue_number
# The issue number should hopefully not be in parentheses
@ -440,9 +437,6 @@ def parse(p: Parser) -> Callable[[Parser], Callable | None] | None:
# Series has already been started/parsed,
# filters out leading alternate numbers leading alternate number
if len(p.series_parts) > 0:
# Unset first item
if p.firstItem:
p.firstItem = False
return parse_issue_number
else:
p.operator_rejected.append(item)
@ -506,9 +500,6 @@ def parse(p: Parser) -> Callable[[Parser], Callable | None] | None:
p.series_parts.append(item)
p.used_items.append(item)
# Unset first item
if p.firstItem:
p.firstItem = False
p.get()
return parse_series
@ -590,6 +581,8 @@ def parse(p: Parser) -> Callable[[Parser], Callable | None] | None:
if series_append:
p.series_parts.append(item)
p.used_items.append(item)
if p.firstItem:
p.firstItem = False
return parse_series
# We found text, it's probably the title or series
@ -602,6 +595,8 @@ def parse(p: Parser) -> Callable[[Parser], Callable | None] | None:
# Usually the word 'of' eg 1 (of 6)
elif item.typ == filenamelexer.ItemType.InfoSpecifier:
if p.firstItem:
p.firstItem = False
return parse_info_specifier
# Operator is a symbol that acts as some sort of separator eg - : ;
@ -622,9 +617,13 @@ def parse(p: Parser) -> Callable[[Parser], Callable | None] | None:
p.irrelevant.extend([item, p.input[p.pos], p.get()])
else:
p.backup()
if p.firstItem:
p.firstItem = False
return parse_series
# This is text that just happens to also be a month/day
else:
if p.firstItem:
p.firstItem = False
return parse_series
# Specifically '__' or '--', no further title/series parsing is done to keep compatibility with wiki
@ -853,10 +852,11 @@ def resolve_year(p: Parser) -> None:
p.used_items.append(vol)
# Remove volume from series and title
if selected_year in p.series_parts:
p.series_parts.remove(selected_year)
if selected_year in p.title_parts:
p.title_parts.remove(selected_year)
# note: this never happens
if vol in p.series_parts:
p.series_parts.remove(vol)
if vol in p.title_parts:
p.title_parts.remove(vol)
# Remove year from series and title
if selected_year in p.series_parts:

View File

@ -93,7 +93,7 @@ class GenericMetadata:
comments: str | None = None # use same way as Summary in CIX
volume_count: int | None = None
critical_rating: float | None = None # rating in cbl; CommunityRating in CIX
critical_rating: float | None = None # rating in CBL; CommunityRating in CIX
country: str | None = None
alternate_series: str | None = None
@ -270,8 +270,10 @@ class GenericMetadata:
def get_primary_credit(self, role: str) -> str:
primary = ""
for credit in self.credits:
if "role" not in credit or "person" not in credit:
continue
if (primary == "" and credit["role"].casefold() == role.casefold()) or (
credit["role"].casefold() == role.casefold() and credit["primary"]
credit["role"].casefold() == role.casefold() and "primary" in credit and credit["primary"]
):
primary = credit["person"]
return primary

View File

@ -19,11 +19,11 @@ import json
import logging
import os
import pathlib
import re
import unicodedata
from collections import defaultdict
from collections.abc import Mapping
from shutil import which # noqa: F401
from typing import Any, Mapping
from typing import Any
import pycountry
import thefuzz.fuzz
@ -57,22 +57,20 @@ def get_recursive_filelist(pathlist: list[str]) -> list[str]:
if os.path.isdir(p):
filelist.extend(x for x in glob.glob(f"{p}{os.sep}/**", recursive=True) if not os.path.isdir(x))
else:
filelist.append(p)
elif str(p) not in filelist:
filelist.append(str(p))
return filelist
def add_to_path(dirname: str) -> None:
if dirname is not None and dirname != "":
if dirname:
dirname = os.path.abspath(dirname)
paths = [os.path.normpath(x) for x in os.environ["PATH"].split(os.pathsep)]
# verify that path doesn't already contain the given dirname
tmpdirname = re.escape(dirname)
pattern = r"(^|{sep}){dir}({sep}|$)".format(dir=tmpdirname, sep=os.pathsep)
match = re.search(pattern, os.environ["PATH"])
if not match:
os.environ["PATH"] = dirname + os.pathsep + os.environ["PATH"]
if dirname not in paths:
paths.insert(0, dirname)
os.environ["PATH"] = os.pathsep.join(paths)
def xlate(data: Any, is_int: bool = False, is_float: bool = False) -> Any:
@ -123,13 +121,9 @@ def remove_articles(text: str) -> str:
"the",
"the",
"with",
"ms",
"mrs",
"mr",
"dr",
]
new_text = ""
for word in text.split(" "):
for word in text.split():
if word not in articles:
new_text += word + " "
@ -141,19 +135,16 @@ def remove_articles(text: str) -> str:
def sanitize_title(text: str, basic: bool = False) -> str:
# normalize unicode and convert to ascii. Does not work for everything eg ½ to 12 not 1/2
text = unicodedata.normalize("NFKD", text).casefold()
if basic:
# comicvine keeps apostrophes a part of the word
text = text.replace("'", "")
text = text.replace('"', "")
else:
# comicvine keeps apostrophes a part of the word
text = text.replace("'", "")
text = text.replace('"', "")
if not basic:
# comicvine ignores punctuation and accents
# remove all characters that are not a letter, separator (space) or number
# replace any "dash punctuation" with a space
# makes sure that batman-superman and self-proclaimed stay separate words
text = "".join(
c if not unicodedata.category(c) in ("Pd",) else " "
for c in text
if unicodedata.category(c)[0] in "LZN" or unicodedata.category(c) in ("Pd",)
c if unicodedata.category(c)[0] not in "P" else " " for c in text if unicodedata.category(c)[0] in "LZNP"
)
# remove extra space and articles and all lower case
text = remove_articles(text).strip()
@ -161,10 +152,10 @@ def sanitize_title(text: str, basic: bool = False) -> str:
return text
def titles_match(search_title: str, record_title: str, threshold: int = 90) -> int:
def titles_match(search_title: str, record_title: str, threshold: int = 90) -> bool:
sanitized_search = sanitize_title(search_title)
sanitized_record = sanitize_title(record_title)
ratio = thefuzz.fuzz.ratio(sanitized_search, sanitized_record)
ratio: int = thefuzz.fuzz.ratio(sanitized_search, sanitized_record)
logger.debug(
"search title: %s ; record title: %s ; ratio: %d ; match threshold: %d",
search_title,
@ -205,6 +196,7 @@ def get_language_from_iso(iso: str | None) -> str | None:
def get_language(string: str | None) -> str | None:
if string is None:
return None
string = string.casefold()
lang = get_language_from_iso(string)
@ -217,8 +209,6 @@ def get_language(string: str | None) -> str | None:
def get_publisher(publisher: str) -> tuple[str, str]:
if publisher is None:
return ("", "")
imprint = ""
for pub in publishers.values():

View File

@ -165,13 +165,13 @@ def post_process_matches(
def cli_mode(opts: argparse.Namespace, settings: ComicTaggerSettings) -> None:
if len(opts.files) < 1:
if len(opts.file_list) < 1:
logger.error("You must specify at least one filename. Use the -h option for more info")
return
match_results = OnlineMatchResults()
for f in opts.files:
for f in opts.file_list:
process_file_cli(f, opts, settings, match_results)
sys.stdout.flush()
@ -212,7 +212,7 @@ def create_local_metadata(opts: argparse.Namespace, ca: ComicArchive, settings:
def process_file_cli(
filename: str, opts: argparse.Namespace, settings: ComicTaggerSettings, match_results: OnlineMatchResults
) -> None:
batch_mode = len(opts.files) > 1
batch_mode = len(opts.file_list) > 1
ca = ComicArchive(filename, settings.rar_exe_path, ComicTaggerSettings.get_graphic("nocover.png"))
@ -500,15 +500,19 @@ def process_file_cli(
try:
new_name = renamer.determine_name(ext=new_ext)
except Exception:
except ValueError:
logger.exception(
msg_hdr + "Invalid format string!\n"
"Your rename template is invalid!\n\n"
"%s\n\n"
"Please consult the template help in the settings "
"and the documentation on the format at "
"https://docs.python.org/3/library/string.html#format-string-syntax"
"https://docs.python.org/3/library/string.html#format-string-syntax",
settings.rename_template,
)
return
except Exception:
logger.exception("Formatter failure: %s metadata: %s", settings.rename_template, renamer.metadata)
folder = get_rename_dir(ca, settings.rename_dir if settings.rename_move_dir else None)

View File

@ -21,6 +21,7 @@ import re
import time
from datetime import datetime
from typing import Any, Callable, cast
from urllib.parse import urlencode, urljoin, urlsplit
import requests
from bs4 import BeautifulSoup
@ -104,7 +105,13 @@ class ComicVineTalker:
self.issue_id: int | None = None
self.api_key = ComicVineTalker.api_key or default_api_key
self.api_base_url = ComicVineTalker.api_base_url or default_url
tmp_url = urlsplit(ComicVineTalker.api_base_url or default_url)
# joinurl only works properly if there is a trailing slash
if tmp_url.path and tmp_url.path[-1] != "/":
tmp_url = tmp_url._replace(path=tmp_url.path + "/")
self.api_base_url = tmp_url.geturl()
self.log_func: Callable[[str], None] | None = None
@ -127,10 +134,16 @@ class ComicVineTalker:
if not url:
url = self.api_base_url
try:
test_url = url + "/issue/1/?api_key=" + key + "&format=json&field_list=name"
test_url = urljoin(url, "issue/1/")
cv_response: resulttypes.CVResult = requests.get(
test_url, headers={"user-agent": "comictagger/" + ctversion.version}
test_url,
headers={"user-agent": "comictagger/" + ctversion.version},
params={
"api_key": key,
"format": "json",
"field_list": "name",
},
).json()
# Bogus request, but if the key is wrong, you get error 100: "Invalid API Key"
@ -171,8 +184,8 @@ class ComicVineTalker:
def get_url_content(self, url: str, params: dict[str, Any]) -> Any:
# connect to server:
# if there is a 500 error, try a few more times before giving up
# any other error, just bail
# if there is a 500 error, try a few more times before giving up
# any other error, just bail
for tries in range(3):
try:
resp = requests.get(url, params=params, headers={"user-agent": "comictagger/" + ctversion.version})
@ -186,10 +199,15 @@ class ComicVineTalker:
break
except requests.exceptions.RequestException as e:
self.write_log(str(e) + "\n")
self.write_log(f"{e}\n")
raise ComicVineTalkerException(ComicVineTalkerException.Network, "Network Error!") from e
except json.JSONDecodeError as e:
self.write_log(f"{e}\n")
raise ComicVineTalkerException(ComicVineTalkerException.Unknown, "ComicVine did not provide json")
raise ComicVineTalkerException(ComicVineTalkerException.Unknown, "Error on Comic Vine server")
raise ComicVineTalkerException(
ComicVineTalkerException.Unknown, f"Error on Comic Vine server: {resp.status_code}"
)
def search_for_series(
self,
@ -222,7 +240,7 @@ class ComicVineTalker:
"limit": 100,
}
cv_response = self.get_cv_content(self.api_base_url + "/search", params)
cv_response = self.get_cv_content(urljoin(self.api_base_url, "search"), params)
search_results: list[resulttypes.CVVolumeResults] = []
@ -268,7 +286,7 @@ class ComicVineTalker:
page += 1
params["page"] = page
cv_response = self.get_cv_content(self.api_base_url + "/search", params)
cv_response = self.get_cv_content(urljoin(self.api_base_url, "search"), params)
search_results.extend(cast(list[resulttypes.CVVolumeResults], cv_response["results"]))
current_result_count += cv_response["number_of_page_results"]
@ -291,7 +309,7 @@ class ComicVineTalker:
if cached_volume_result is not None:
return cached_volume_result
volume_url = self.api_base_url + "/volume/" + CVTypeID.Volume + "-" + str(series_id)
volume_url = urljoin(self.api_base_url, f"volume/{CVTypeID.Volume}-{series_id}")
params = {
"api_key": self.api_key,
@ -317,12 +335,12 @@ class ComicVineTalker:
params = {
"api_key": self.api_key,
"filter": "volume:" + str(series_id),
"filter": f"volume:{series_id}",
"format": "json",
"field_list": "id,volume,issue_number,name,image,cover_date,site_detail_url,description,aliases",
"offset": 0,
}
cv_response = self.get_cv_content(self.api_base_url + "/issues/", params)
cv_response = self.get_cv_content(urljoin(self.api_base_url, "issues/"), params)
current_result_count = cv_response["number_of_page_results"]
total_result_count = cv_response["number_of_total_results"]
@ -337,7 +355,7 @@ class ComicVineTalker:
offset += cv_response["number_of_page_results"]
params["offset"] = offset
cv_response = self.get_cv_content(self.api_base_url + "/issues/", params)
cv_response = self.get_cv_content(urljoin(self.api_base_url, "issues/"), params)
volume_issues_result.extend(cast(list[resulttypes.CVIssuesResults], cv_response["results"]))
current_result_count += cv_response["number_of_page_results"]
@ -367,7 +385,7 @@ class ComicVineTalker:
"filter": flt,
}
cv_response = self.get_cv_content(self.api_base_url + "/issues/", params)
cv_response = self.get_cv_content(urljoin(self.api_base_url, "issues/"), params)
current_result_count = cv_response["number_of_page_results"]
total_result_count = cv_response["number_of_total_results"]
@ -382,7 +400,7 @@ class ComicVineTalker:
offset += cv_response["number_of_page_results"]
params["offset"] = offset
cv_response = self.get_cv_content(self.api_base_url + "/issues/", params)
cv_response = self.get_cv_content(urljoin(self.api_base_url, "issues/"), params)
filtered_issues_result.extend(cast(list[resulttypes.CVIssuesResults], cv_response["results"]))
current_result_count += cv_response["number_of_page_results"]
@ -407,7 +425,7 @@ class ComicVineTalker:
break
if f_record is not None:
issue_url = self.api_base_url + "/issue/" + CVTypeID.Issue + "-" + str(f_record["id"])
issue_url = urljoin(self.api_base_url, f"issue/{CVTypeID.Issue}-{f_record['id']}")
params = {"api_key": self.api_key, "format": "json"}
cv_response = self.get_cv_content(issue_url, params)
issue_results = cast(resulttypes.CVIssueDetailResults, cv_response["results"])
@ -420,7 +438,7 @@ class ComicVineTalker:
def fetch_issue_data_by_issue_id(self, issue_id: int, settings: ComicTaggerSettings) -> GenericMetadata:
issue_url = self.api_base_url + "/issue/" + CVTypeID.Issue + "-" + str(issue_id)
issue_url = urljoin(self.api_base_url, f"issue/{CVTypeID.Issue}-{issue_id}")
params = {"api_key": self.api_key, "format": "json"}
cv_response = self.get_cv_content(issue_url, params)
@ -609,7 +627,8 @@ class ComicVineTalker:
if cached_details["image_url"] is not None:
return cached_details
issue_url = self.api_base_url + "/issue/" + CVTypeID.Issue + "-" + str(issue_id)
issue_url = urljoin(self.api_base_url, f"issue/{CVTypeID.Issue}-{issue_id}")
logger.error("%s, %s", self.api_base_url, issue_url)
params = {"api_key": self.api_key, "format": "json", "field_list": "image,cover_date,site_detail_url"}
@ -708,19 +727,20 @@ class ComicVineTalker:
ComicVineTalker.url_fetch_complete(details["image_url"], details["thumb_image_url"])
return
issue_url = (
self.api_base_url
+ "/issue/"
+ CVTypeID.Issue
+ "-"
+ str(issue_id)
+ "/?api_key="
+ self.api_key
+ "&format=json&field_list=image,cover_date,site_detail_url"
issue_url = urlsplit(self.api_base_url)
issue_url = issue_url._replace(
query=urlencode(
{
"api_key": self.api_key,
"format": "json",
"field_list": "image,cover_date,site_detail_url",
}
),
path=f"issue/{CVTypeID.Issue}-{issue_id}",
)
self.nam.finished.connect(self.async_fetch_issue_cover_url_complete)
self.nam.get(QtNetwork.QNetworkRequest(QtCore.QUrl(issue_url)))
self.nam.get(QtNetwork.QNetworkRequest(QtCore.QUrl(issue_url.geturl())))
def async_fetch_issue_cover_url_complete(self, reply: QtNetwork.QNetworkReply) -> None:
# read in the response

View File

@ -113,7 +113,7 @@ class CoverImageWidget(QtWidgets.QWidget):
self.page_loader = None
self.imageIndex = -1
self.imageCount = 1
self.imageData = bytes()
self.imageData = b""
self.btnLeft.setIcon(QtGui.QIcon(ComicTaggerSettings.get_graphic("left.png")))
self.btnRight.setIcon(QtGui.QIcon(ComicTaggerSettings.get_graphic("right.png")))
@ -136,7 +136,7 @@ class CoverImageWidget(QtWidgets.QWidget):
self.page_loader = None
self.imageIndex = -1
self.imageCount = 1
self.imageData = bytes()
self.imageData = b""
def clear(self) -> None:
self.reset_widget()

View File

@ -109,11 +109,12 @@ class MetadataFormatter(string.Formatter):
# format the object and append to the result
fmt_obj = self.format_field(obj, format_spec)
if fmt_obj == "" and len(result) > 0 and self.smart_cleanup and literal_text:
lstrip = True
if fmt_obj == "" and result and self.smart_cleanup and literal_text:
if self.str_contains(result[-1], "({["):
lstrip = True
if result:
if " " in result[-1]:
result[-1], _, _ = result[-1].rpartition(" ")
result[-1], _, _ = result[-1].rstrip().rpartition(" ")
result[-1] = result[-1].rstrip("-_({[#")
if self.smart_cleanup:
fmt_obj = " ".join(fmt_obj.split())
@ -122,6 +123,12 @@ class MetadataFormatter(string.Formatter):
return "".join(result), False
def str_contains(self, chars: str, string: str) -> bool:
for char in chars:
if char in string:
return True
return False
class FileRenamer:
def __init__(self, metadata: GenericMetadata | None, platform: str = "auto") -> None:

View File

@ -97,14 +97,14 @@ class ImageFetcher:
# if we found it, just emit the signal asap
if image_data:
ImageFetcher.image_fetch_complete(QtCore.QByteArray(image_data))
return bytes()
return b""
# didn't find it. look online
self.nam.finished.connect(self.finish_request)
self.nam.get(QtNetwork.QNetworkRequest(QtCore.QUrl(url)))
# we'll get called back when done...
return bytes()
return b""
def finish_request(self, reply: QtNetwork.QNetworkReply) -> None:
# read in the image data
@ -159,10 +159,10 @@ class ImageFetcher:
row = cur.fetchone()
if row is None:
return bytes()
return b""
filename = row[0]
image_data = bytes()
image_data = b""
try:
with open(filename, "rb") as f:

View File

@ -157,7 +157,7 @@ class IssueIdentifier:
cropped_im = im.crop((int(w / 2), 0, w, h))
except Exception:
logger.exception("cropCover() error")
return bytes()
return b""
output = io.BytesIO()
cropped_im.save(output, format="PNG")

View File

@ -405,6 +405,8 @@ def parse_cmd_line() -> argparse.Namespace:
opts.copy = opts.copy[0]
if opts.recursive:
opts.file_list = utils.get_recursive_filelist(opts.file_list)
opts.file_list = utils.get_recursive_filelist(opts.files)
else:
opts.file_list = opts.files
return opts

View File

@ -97,7 +97,8 @@ class RenameWindow(QtWidgets.QDialog):
new_ext = self.config_renamer(ca)
try:
new_name = self.renamer.determine_name(new_ext)
except Exception as e:
except ValueError as e:
logger.exception("Invalid format string: %s", self.settings.rename_template)
QtWidgets.QMessageBox.critical(
self,
"Invalid format string!",
@ -109,6 +110,19 @@ class RenameWindow(QtWidgets.QDialog):
"https://docs.python.org/3/library/string.html#format-string-syntax</a>",
)
return
except Exception as e:
logger.exception(
"Formatter failure: %s metadata: %s", self.settings.rename_template, self.renamer.metadata
)
QtWidgets.QMessageBox.critical(
self,
"The formatter had an issue!",
"The formatter has experienced an unexpected error!"
f"<br/><br/>{type(e).__name__}: {e}<br/><br/>"
"Please open an issue at "
"<a href='https://github.com/comictagger/comictagger'>"
"https://github.com/comictagger/comictagger</a>",
)
row = self.twList.rowCount()
self.twList.insertRow(row)

View File

@ -22,7 +22,8 @@ import pathlib
import platform
import sys
import uuid
from typing import Iterator, TextIO, no_type_check
from collections.abc import Iterator
from typing import TextIO, no_type_check
from comicapi import utils

View File

@ -269,17 +269,32 @@ class SettingsWindow(QtWidgets.QDialog):
def accept(self) -> None:
self.rename_test()
if self.rename_error is not None:
QtWidgets.QMessageBox.critical(
self,
"Invalid format string!",
"Your rename template is invalid!"
f"<br/><br/>{self.rename_error}<br/><br/>"
"Please consult the template help in the "
"settings and the documentation on the format at "
"<a href='https://docs.python.org/3/library/string.html#format-string-syntax'>"
"https://docs.python.org/3/library/string.html#format-string-syntax</a>",
)
return
if isinstance(self.rename_error, ValueError):
logger.exception("Invalid format string: %s", self.settings.rename_template)
QtWidgets.QMessageBox.critical(
self,
"Invalid format string!",
"Your rename template is invalid!"
f"<br/><br/>{self.rename_error}<br/><br/>"
"Please consult the template help in the "
"settings and the documentation on the format at "
"<a href='https://docs.python.org/3/library/string.html#format-string-syntax'>"
"https://docs.python.org/3/library/string.html#format-string-syntax</a>",
)
return
else:
logger.exception(
"Formatter failure: %s metadata: %s", self.settings.rename_template, self.renamer.metadata
)
QtWidgets.QMessageBox.critical(
self,
"The formatter had an issue!",
"The formatter has experienced an unexpected error!"
f"<br/><br/>{type(self.rename_error).__name__}: {self.rename_error}<br/><br/>"
"Please open an issue at "
"<a href='https://github.com/comictagger/comictagger'>"
"https://github.com/comictagger/comictagger</a>",
)
# Copy values from form to settings and save
self.settings.rar_exe_path = str(self.leRarExePath.text())

View File

@ -26,7 +26,8 @@ import pprint
import re
import sys
import webbrowser
from typing import Any, Callable, Iterable, cast
from collections.abc import Iterable
from typing import Any, Callable, cast
from urllib.parse import urlparse
import natsort
@ -1854,7 +1855,7 @@ Have fun!
logger.error("Failed to load metadata for %s: %s", ca.path, e)
image_data = ca.get_page(cover_idx)
self.atprogdialog.set_archive_image(image_data)
self.atprogdialog.set_test_image(bytes())
self.atprogdialog.set_test_image(b"")
QtCore.QCoreApplication.processEvents()
if self.atprogdialog.isdone:

1
requirements-speedup.txt Normal file
View File

@ -0,0 +1 @@
thefuzz[speedup]>=0.19.0

View File

@ -2,11 +2,11 @@ beautifulsoup4 >= 4.1
importlib_metadata
natsort>=8.1.0
pathvalidate
pillow>=4.3.0
pillow>=9.1.0
py7zr
pycountry
requests==2.*
text2digits
thefuzz[speedup]>=0.19.0
thefuzz>=0.19.0
typing_extensions
wordninja

View File

@ -99,14 +99,22 @@ metadata_keys = [
]
credits = [
("writer", "Dara Naraghi"),
("writeR", "Dara Naraghi"),
(comicapi.genericmetadata.md_test, "writer", "Dara Naraghi"),
(comicapi.genericmetadata.md_test, "writeR", "Dara Naraghi"),
(
comicapi.genericmetadata.md_test.replace(
credits=[{"person": "Dara Naraghi", "role": "writer"}, {"person": "Dara Naraghi", "role": "writer"}]
),
"writeR",
"Dara Naraghi",
),
]
imprints = [
("marvel", ("", "Marvel")),
("marvel comics", ("", "Marvel")),
("aircel", ("Aircel Comics", "Marvel")),
("nothing", ("", "nothing")),
]
additional_imprints = [

View File

@ -733,6 +733,20 @@ fnames = [
},
True,
),
(
"Cory Doctorow's Futuristic Tales of the Here and Now: Anda's Game #001 (2007).cbz",
"full-date, issue in parenthesis",
{
"issue": "1",
"series": "Cory Doctorow's Futuristic Tales of the Here and Now",
"title": "Anda's Game",
"volume": "",
"year": "2007",
"remainder": "",
"issue_count": "",
},
True,
),
]
rnames = [
@ -795,7 +809,7 @@ rnames = [
(
r"{publisher}\ {series} #{issue} - {title} ({year})", # backslashes separate directories
False,
"universal",
"Linux",
"Cory Doctorow's Futuristic Tales of the Here and Now #001 - Anda's Game (2007).cbz",
does_not_raise(),
),
@ -807,10 +821,10 @@ rnames = [
does_not_raise(),
),
(
"{series} # {issue} - {locations} ({year})",
"{series} #{issue} - {locations} ({year})",
False,
"universal",
"Cory Doctorow's Futuristic Tales of the Here and Now # 001 - lonely cottage (2007).cbz",
"Cory Doctorow's Futuristic Tales of the Here and Now #001 - lonely cottage (2007).cbz",
does_not_raise(),
),
(
@ -848,6 +862,20 @@ rnames = [
"Cory Doctorow's Futuristic Tales of the Here and Now - Anda's Game {test} #001 (2007).cbz",
does_not_raise(),
),
(
"{series} - {title} #{issue} ({year} {price})", # Test null value in parenthesis with a non-null value
False,
"universal",
"Cory Doctorow's Futuristic Tales of the Here and Now - Anda's Game #001 (2007).cbz",
does_not_raise(),
),
(
"{series} - {title} #{issue} (of {price})", # null value with literal text in parenthesis
False,
"universal",
"Cory Doctorow's Futuristic Tales of the Here and Now - Anda's Game #001.cbz",
does_not_raise(),
),
(
"{series} - {title} {1} #{issue} ({year})", # Test numeric key
False,

View File

@ -88,3 +88,11 @@ def test_copy_from_archive(archiver, tmp_path, cbz):
md = comic_archive.read_cix()
assert md == comicapi.genericmetadata.md_test
def test_rename(tmp_comic, tmp_path):
old_path = tmp_comic.path
tmp_comic.rename(tmp_path / "test.cbz")
assert not old_path.exists()
assert tmp_comic.path.exists()
assert tmp_comic.path != old_path

View File

@ -5,7 +5,8 @@ import datetime
import io
import shutil
import unittest.mock
from typing import Any, Generator
from collections.abc import Generator
from typing import Any
import pytest
import requests

View File

@ -37,6 +37,6 @@ def test_add_credit_primary():
assert md.credits == [comicapi.genericmetadata.CreditMetadata(person="test", role="writer", primary=True)]
@pytest.mark.parametrize("role, expected", credits)
@pytest.mark.parametrize("md, role, expected", credits)
def test_get_primary_credit(md, role, expected):
assert md.get_primary_credit(role) == expected

View File

@ -1,5 +1,7 @@
from __future__ import annotations
import os
import pytest
import comicapi.utils
@ -22,13 +24,16 @@ def test_recursive_list_with_file(tmp_path) -> None:
temp_txt = tmp_path / "info.txt"
temp_txt.write_text("this is here")
expected_result = {str(foo_png), str(temp_cbr), str(temp_file), str(temp_txt)}
result = set(comicapi.utils.get_recursive_filelist([tmp_path]))
temp_txt2 = tmp_path / "info2.txt"
temp_txt2.write_text("this is here")
expected_result = {str(foo_png), str(temp_cbr), str(temp_file), str(temp_txt), str(temp_txt2)}
result = set(comicapi.utils.get_recursive_filelist([str(temp_txt2), tmp_path]))
assert result == expected_result
values = [
xlate_values = [
({"data": "", "is_int": False, "is_float": False}, None),
({"data": None, "is_int": False, "is_float": False}, None),
({"data": None, "is_int": True, "is_float": False}, None),
@ -52,6 +57,70 @@ values = [
]
@pytest.mark.parametrize("value, result", values)
@pytest.mark.parametrize("value, result", xlate_values)
def test_xlate(value, result):
assert comicapi.utils.xlate(**value) == result
language_values = [
("en", "English"),
("EN", "English"),
("En", "English"),
("", None),
(None, None),
]
@pytest.mark.parametrize("value, result", language_values)
def test_get_language(value, result):
assert result == comicapi.utils.get_language(value)
def test_unique_file(tmp_path):
file = tmp_path / "test"
assert file == comicapi.utils.unique_file(file)
file.mkdir()
assert (tmp_path / "test (1)") == comicapi.utils.unique_file(file)
def test_add_to_path(monkeypatch):
monkeypatch.setenv("PATH", os.path.abspath("/usr/bin"))
comicapi.utils.add_to_path("/bin")
assert os.environ["PATH"] == (os.path.abspath("/bin") + os.pathsep + os.path.abspath("/usr/bin"))
comicapi.utils.add_to_path("/usr/bin")
comicapi.utils.add_to_path("/usr/bin/")
assert os.environ["PATH"] == (os.path.abspath("/bin") + os.pathsep + os.path.abspath("/usr/bin"))
titles = [
(("", ""), True),
(("Conan el Barbaro", "Conan el Bárbaro"), True),
(("鋼の錬金術師", "鋼の錬金術師"), True),
(("钢之炼金术师", "鋼の錬金術師"), False),
(("batmans grave", "The Batman's Grave"), True),
(("batman grave", "The Batman's Grave"), True),
(("bats grave", "The Batman's Grave"), False),
]
@pytest.mark.parametrize("value, result", titles)
def test_titles_match(value, result):
assert comicapi.utils.titles_match(value[0], value[1]) == result
titles_2 = [
("", ""),
("鋼の錬金術師", "鋼の錬金術師"),
("Conan el Bárbaro", "Conan el Barbaro"),
("The Batman's Grave", "batmans grave"),
("A+X", "ax"),
("ms. marvel", "ms marvel"),
("spider-man/deadpool", "spider man deadpool"),
]
@pytest.mark.parametrize("value, result", titles_2)
def test_sanitize_title(value, result):
assert comicapi.utils.sanitize_title(value) == result.casefold()