python-minifier - pypi Package: Comparing version 3.0.0 to 3.1.0
MANIFEST.in
include README.md
include LICENSE
include setup.py
graft src/python_minifier
graft test
exclude .gitignore
exclude tox.ini
exclude tox-windows.ini
exclude requirements-dev.txt
exclude CHANGELOG.md
prune .github
prune .config
prune docker
prune docs
prune corpus_test
prune hypo_test
prune typing_test
prune xtest
prune tox
global-exclude *.pyc
global-exclude __pycache__
# Renaming
We can save bytes by shortening the names used in a python program.
One simple way to do this is to replace each unique name in a module with a shorter one.
This will probably exhaust the available single-character names, so it is not as efficient as it could be.
Also, not all names can be safely changed this way.
By determining the scope of each name, we can assign the same short name to multiple non-overlapping scopes.
This means sibling namespaces may have the same names, and names will be shadowed in inner namespaces where possible.
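For example (an illustrative sketch, not actual minifier output), two sibling functions occupy non-overlapping scopes, so the argument of each can be renamed to the same short name:

```python
# Illustrative only: sibling scopes never overlap, so the same short
# name 'A' can be reused in both functions.
source = (
    "def first(alpha):\n"
    "    return alpha + 1\n"
    "def second(beta):\n"
    "    return beta * 2\n"
)
renamed = (
    "def first(A):\n"
    "    return A + 1\n"
    "def second(A):\n"
    "    return A * 2\n"
)
compile(source, '<source>', 'exec')    # both versions are valid Python
compile(renamed, '<renamed>', 'exec')
saved = len(source) - len(renamed)     # bytes saved by the renames
```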
This file describes how the python_minifier package shortens names.
There are multiple steps to the renaming process.
## Binding Names
Each name is bound to the local namespace it is defined in.
### Create namespace nodes
Namespaces in python are introduced by Modules, Functions, Comprehensions, Generators and Classes.
The AST node that introduces a new namespace is called a 'namespace node'.
These attributes are added to namespace nodes:
- Bindings - A list of Bindings local to this namespace, populated by the Bind names step
- Globals - A list of global names in this namespace
- Nonlocals - A list of nonlocal names in this namespace
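A minimal sketch of this step (the attribute names follow the description above and the package's test helpers; the real pass lives in python_minifier.rename):

```python
import ast

# Node types that introduce a new namespace (Lambda behaves like a
# function for this purpose).
NAMESPACE_NODES = (ast.Module, ast.FunctionDef, ast.AsyncFunctionDef,
                   ast.Lambda, ast.ClassDef, ast.ListComp, ast.SetComp,
                   ast.DictComp, ast.GeneratorExp)

def create_namespace_attributes(tree):
    for node in ast.walk(tree):
        if isinstance(node, NAMESPACE_NODES):
            node.bindings = []           # filled in by the "Bind names" step
            node.global_names = set()    # names declared 'global' here
            node.nonlocal_names = set()  # names declared 'nonlocal' here

tree = ast.parse("def f():\n    pass\n")
create_namespace_attributes(tree)
```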
### Determine parent node
Add a parent attribute to each node with the value of the node of which this is a child.
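A minimal sketch of the pass (the package's real helpers are add_parent/get_parent in python_minifier.ast_annotation, exercised by the tests further down):

```python
import ast

def add_parent_sketch(node):
    """Recursively record each child's parent node."""
    for child in ast.iter_child_nodes(node):
        child.parent = node
        add_parent_sketch(child)

tree = ast.parse("x = 1")
add_parent_sketch(tree)
assign = tree.body[0]  # the Assign node; its parent is the Module
```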
### Determine namespace
Add a namespace attribute to each node with the value of the namespace node that will be used for name binding and resolution.
This is usually the closest parent namespace node. The exceptions are:
- Function argument default values are in the same namespace as their function.
- Function decorators are in the same namespace as their function.
- Function annotations are in the same namespace as their function.
- Class decorators are in the same namespace as their class.
- Class bases, keywords, starargs and kwargs are in the same namespace as their class.
- The first iteration expression of a comprehension is in the same namespace as its parent ListComp/SetComp/DictComp or GeneratorExp.
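The first exception can be observed at runtime: a default value expression resolves in the function's enclosing namespace, even when the function body rebinds the same name:

```python
tag = "outer"

def configure(value=tag):  # 'tag' in the default resolves in the module namespace
    tag = "inner"          # this is a separate binding, local to configure()
    return value, tag

result = configure()  # the default captured the module-level value
```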
### Bind names
Every node that binds a name creates a NameBinding for that name in its namespace.
The node is added to the NameBinding as a reference.
If the name is nonlocal in its namespace it does not create a binding.
Nodes that create a binding:
- FunctionDef nodes bind their name
- ClassDef nodes bind their name
- arg nodes bind their arg
- Name nodes in Store or Del context bind their id
- MatchAs nodes bind their name
- MatchStar nodes bind their name
- MatchMapping nodes bind their rest
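The common cases can be sketched like this (a simplification: the pattern-matching nodes and the nonlocal check are omitted for brevity):

```python
import ast

def bound_name(node):
    """Return the name a node binds, or None (simplified)."""
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
        return node.name
    if isinstance(node, ast.arg):
        return node.arg
    if isinstance(node, ast.Name) and isinstance(node.ctx, (ast.Store, ast.Del)):
        return node.id
    return None

tree = ast.parse("def f(a):\n    b = a\n")
# 'f' (FunctionDef), 'a' (arg) and 'b' (Name in Store context) bind;
# the Load of 'a' in the body does not.
bound = sorted({bound_name(n) for n in ast.walk(tree)} - {None})
```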
### Resolve names
For the remaining unbound name nodes and nodes that normally create a binding but are for a nonlocal name, we find their binding.
Bindings for name references are found by searching their namespace, then parent namespaces.
If a name is global in a searched namespace, skip straight to the module node.
If a name is nonlocal in a searched namespace, skip to the next parent namespace.
When traversing parent namespaces, Class namespaces are skipped.
If a NameBinding is found, add the node as a reference.
If no NameBinding is found, check if the name would resolve to a builtin.
If so, create a BuiltinBinding in the module namespace and add this node as a reference.
Otherwise we failed to find a binding for this name: create a NameBinding in the module namespace and add this node
as a reference.
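The namespace walk can be sketched with a toy structure (illustrative; the real resolution also honours global/nonlocal declarations and falls back to builtins):

```python
class Namespace:
    def __init__(self, parent=None, is_class=False):
        self.parent = parent
        self.is_class = is_class
        self.bindings = {}  # name -> binding

def find_binding(name, namespace):
    first = True
    while namespace is not None:
        # Class namespaces are skipped when traversing parents, but not
        # when they are the namespace the reference appears in.
        skip = namespace.is_class and not first
        if not skip and name in namespace.bindings:
            return namespace.bindings[name]
        first = False
        namespace = namespace.parent
    return None

module = Namespace()
module.bindings['x'] = 'module binding'
cls = Namespace(parent=module, is_class=True)
cls.bindings['x'] = 'class binding'
method = Namespace(parent=cls)
# From a method body, the class binding is invisible:
resolved = find_binding('x', method)
```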
## Hoist Literals
At this point we do the HoistLiterals transform, which adds new HoistedLiteral bindings to the namespaces where it wants
to introduce new names.
## Name Assignment
Collect all bindings in the module and sort them by estimated byte savings.
For each binding:
- Determine its 'reservation scope', which is the set of namespaces that name is referenced in (and all namespaces between them)
- Get the next available name that is unassigned and unreserved in all namespaces in the reservation scope.
- Check if we should proceed with the rename - is it space efficient to do this rename, or has the original name been assigned somewhere else?
- Rename the binding, rename all referenced nodes to the new name, and record this name as assigned in every namespace of the reservation scope.
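Picking "the next available name" can be sketched as follows (assuming each namespace in the reservation scope exposes its assigned and reserved names as a set; illustrative only):

```python
import string

def name_candidates():
    """Yield short identifiers: single letters first, then two characters."""
    for c in string.ascii_letters:
        yield c
    for a in string.ascii_letters:
        for b in string.ascii_letters + string.digits:
            yield a + b

def next_available(reservation_scope):
    for candidate in name_candidates():
        # The name must be free in every namespace of the reservation scope
        if all(candidate not in reserved for reserved in reservation_scope):
            return candidate
    raise RuntimeError('no short name available')

# 'a', 'b' and 'c' are taken somewhere in the scope, so 'd' is picked:
picked = next_available([{'a', 'b'}, {'a', 'c'}])
```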
"""
Template String (T-String) unparsing
T-strings in Python 3.14 follow PEP 750 and are based on PEP 701,
which means they don't have the quote restrictions of older f-strings.
This implementation is much simpler than f_string.py because:
- No quote tracking needed (PEP 701 benefits)
- No pep701 parameter needed (always true for t-strings)
- No Outer vs Inner distinction needed
- Always use all quote types
"""
import python_minifier.ast_compat as ast
from python_minifier import UnstableMinification
from python_minifier.ast_compare import CompareError, compare_ast
from python_minifier.expression_printer import ExpressionPrinter
from python_minifier.ministring import MiniString
from python_minifier.token_printer import TokenTypes
from python_minifier.util import is_constant_node
class TString(object):
"""
A Template String (t-string)
Much simpler than f-strings because PEP 701 eliminates quote restrictions
"""
def __init__(self, node):
assert isinstance(node, ast.TemplateStr)
self.node = node
# Always use all quotes - no restrictions due to PEP 701
self.allowed_quotes = ['"', "'", '"""', "'''"]
def is_correct_ast(self, code):
"""Check if the generated code produces the same AST"""
try:
c = ast.parse(code, 'TString candidate', mode='eval')
compare_ast(self.node, c.body)
return True
except Exception:
return False
def complete_debug_specifier(self, partial_specifier_candidates, value_node):
"""Complete debug specifier candidates for an Interpolation node"""
assert isinstance(value_node, ast.Interpolation)
conversion = ''
if value_node.conversion == 115: # 's'
conversion = '!s'
elif value_node.conversion == 114 and value_node.format_spec is not None:
# This is the default for debug specifiers, unless there's a format_spec
conversion = '!r'
elif value_node.conversion == 97: # 'a'
conversion = '!a'
conversion_candidates = [x + conversion for x in partial_specifier_candidates]
if value_node.format_spec is not None:
# Handle format specifications in debug specifiers
if isinstance(value_node.format_spec, ast.JoinedStr):
import python_minifier.f_string
format_specs = python_minifier.f_string.FormatSpec(value_node.format_spec, self.allowed_quotes, pep701=True).candidates()
conversion_candidates = [c + ':' + fs for c in conversion_candidates for fs in format_specs]
return [x + '}' for x in conversion_candidates]
def candidates(self):
"""Generate all possible representations"""
actual_candidates = []
# Normal t-string candidates
actual_candidates.extend(self._generate_candidates_with_processor('t', self.str_for))
# Raw t-string candidates (if we detect backslashes)
if self._contains_literal_backslashes():
actual_candidates.extend(self._generate_candidates_with_processor('rt', self.raw_str_for))
return filter(self.is_correct_ast, actual_candidates)
def _generate_candidates_with_processor(self, prefix, str_processor):
"""Generate t-string candidates using the given prefix and string processor function."""
candidates = []
for quote in self.allowed_quotes:
quote_candidates = ['']
debug_specifier_candidates = []
for v in self.node.values:
if is_constant_node(v, ast.Constant) and isinstance(v.value, str):
# String literal part - check for debug specifiers
# Could this be used as a debug specifier?
if len(quote_candidates) < 10:
import re
debug_specifier = re.match(r'.*=\s*$', v.value)
if debug_specifier:
# Maybe! Save for potential debug specifier completion
try:
debug_specifier_candidates = [x + '{' + v.value for x in quote_candidates]
except Exception:
continue
try:
quote_candidates = [x + str_processor(v.value, quote) for x in quote_candidates]
except Exception:
continue
elif isinstance(v, ast.Interpolation):
# Interpolated expression part - check for debug completion
try:
# Try debug specifier completion
completed = self.complete_debug_specifier(debug_specifier_candidates, v)
# Regular interpolation processing
interpolation_candidates = InterpolationValue(v).get_candidates()
quote_candidates = [x + y for x in quote_candidates for y in interpolation_candidates] + completed
debug_specifier_candidates = []
except Exception:
continue
else:
raise RuntimeError('Unexpected TemplateStr value: %r' % v)
candidates.extend([prefix + quote + x + quote for x in quote_candidates])
return candidates
def str_for(self, s, quote):
"""Convert string literal to properly escaped form"""
# Use MiniString for optimal string representation
# Always allowed due to PEP 701 - no backslash restrictions
mini_s = str(MiniString(s, quote)).replace('{', '{{').replace('}', '}}')
if mini_s == '':
return '\\\n'
return mini_s
    def raw_str_for(self, s, quote):  # 'quote' matches str_for's signature; unused for raw strings
"""
Generate string representation for raw t-strings.
Don't escape backslashes like MiniString does.
"""
return s.replace('{', '{{').replace('}', '}}')
def _contains_literal_backslashes(self):
"""
Check if this t-string contains literal backslashes in constant values.
This indicates it may need to be a raw t-string.
"""
for node in ast.walk(self.node):
if is_constant_node(node, ast.Str):
if '\\' in node.s:
return True
return False
def __str__(self):
"""Generate the shortest valid t-string representation"""
if len(self.node.values) == 0:
return 't' + min(self.allowed_quotes, key=len) * 2
candidates = list(self.candidates())
# Validate all candidates
for candidate in candidates:
try:
minified_t_string = ast.parse(candidate, 'python_minifier.t_string output', mode='eval').body
except SyntaxError as syntax_error:
raise UnstableMinification(syntax_error, '', candidate)
try:
compare_ast(self.node, minified_t_string)
except CompareError as compare_error:
raise UnstableMinification(compare_error, '', candidate)
if not candidates:
raise ValueError('Unable to create representation for t-string')
return min(candidates, key=len)
class InterpolationValue(ExpressionPrinter):
"""
A Template String Interpolation Part
Handles ast.Interpolation nodes (equivalent to FormattedValue for f-strings)
"""
def __init__(self, node):
super(InterpolationValue, self).__init__()
assert isinstance(node, ast.Interpolation)
self.node = node
# Always use all quotes - no restrictions due to PEP 701
self.allowed_quotes = ['"', "'", '"""', "'''"]
self.candidates = ['']
def get_candidates(self):
"""Generate all possible representations of this interpolation"""
self.printer.delimiter('{')
if self.is_curly(self.node.value):
self.printer.delimiter(' ')
self._expression(self.node.value)
# Handle conversion specifiers
if self.node.conversion == 115: # 's'
self.printer.append('!s', TokenTypes.Delimiter)
elif self.node.conversion == 114: # 'r'
self.printer.append('!r', TokenTypes.Delimiter)
elif self.node.conversion == 97: # 'a'
self.printer.append('!a', TokenTypes.Delimiter)
# Handle format specifications
if self.node.format_spec is not None:
self.printer.delimiter(':')
# Format spec is a JoinedStr (f-string) in the AST
if isinstance(self.node.format_spec, ast.JoinedStr):
import python_minifier.f_string
# Use f-string processing for format specs
format_candidates = python_minifier.f_string.OuterFString(
self.node.format_spec, pep701=True
).candidates()
# Remove the f/rf prefix and quotes to get just the format part
format_parts = []
for fmt in format_candidates:
# Handle both f"..." and rf"..." patterns
if fmt.startswith('rf'):
# Remove rf prefix and outer quotes
inner = fmt[2:]
elif fmt.startswith('f'):
# Remove f prefix and outer quotes
inner = fmt[1:]
else:
continue
if (inner.startswith('"') and inner.endswith('"')) or \
(inner.startswith("'") and inner.endswith("'")):
format_parts.append(inner[1:-1])
elif (inner.startswith('"""') and inner.endswith('"""')) or \
(inner.startswith("'''") and inner.endswith("'''")):
format_parts.append(inner[3:-3])
else:
format_parts.append(inner)
if format_parts:
self._append(format_parts)
else:
# Simple constant format spec
self.printer.append(str(self.node.format_spec), TokenTypes.Delimiter)
self.printer.delimiter('}')
self._finalize()
return self.candidates
def is_curly(self, node):
"""Check if expression starts with curly braces (needs space)"""
if isinstance(node, (ast.SetComp, ast.DictComp, ast.Set, ast.Dict)):
return True
if isinstance(node, (ast.Expr, ast.Attribute, ast.Subscript)):
return self.is_curly(node.value)
if isinstance(node, (ast.Compare, ast.BinOp)):
return self.is_curly(node.left)
if isinstance(node, ast.Call):
return self.is_curly(node.func)
if isinstance(node, ast.BoolOp):
return self.is_curly(node.values[0])
if isinstance(node, ast.IfExp):
return self.is_curly(node.body)
return False
def visit_Constant(self, node):
"""Handle constant values in interpolations"""
if isinstance(node.value, str):
# Use Str class from f_string module for string handling
from python_minifier.f_string import Str
self.printer.append(str(Str(node.value, self.allowed_quotes, pep701=True)), TokenTypes.NonNumberLiteral)
elif isinstance(node.value, bytes):
# Use Bytes class from f_string module for bytes handling
from python_minifier.f_string import Bytes
self.printer.append(str(Bytes(node.value, self.allowed_quotes)), TokenTypes.NonNumberLiteral)
else:
# Other constants (numbers, None, etc.)
super().visit_Constant(node)
def visit_TemplateStr(self, node):
"""Handle nested t-strings"""
assert isinstance(node, ast.TemplateStr)
if self.printer.previous_token in [TokenTypes.Identifier, TokenTypes.Keyword, TokenTypes.SoftKeyword]:
self.printer.delimiter(' ')
# Nested t-string - no quote restrictions due to PEP 701
self._append(TString(node).candidates())
def visit_JoinedStr(self, node):
"""Handle nested f-strings in t-strings"""
assert isinstance(node, ast.JoinedStr)
if self.printer.previous_token in [TokenTypes.Identifier, TokenTypes.Keyword, TokenTypes.SoftKeyword]:
self.printer.delimiter(' ')
import python_minifier.f_string
# F-strings nested in t-strings also benefit from PEP 701
self._append(python_minifier.f_string.OuterFString(node, pep701=True).candidates())
def visit_Lambda(self, node):
"""Handle lambda expressions in interpolations"""
self.printer.delimiter('(')
super().visit_Lambda(node)
self.printer.delimiter(')')
def _finalize(self):
"""Finalize the current printer state"""
self.candidates = [x + str(self.printer) for x in self.candidates]
self.printer._code = ''
def _append(self, candidates):
"""Append multiple candidate strings"""
self._finalize()
self.candidates = [x + y for x in self.candidates for y in candidates]
import pytest
import ast
from python_minifier.ast_annotation import add_parent, get_parent, set_parent
def test_add_parent():
source = '''
class A:
def b(self):
pass
'''
tree = ast.parse(source)
add_parent(tree)
assert isinstance(tree, ast.Module)
assert isinstance(tree.body[0], ast.ClassDef)
assert get_parent(tree.body[0]) is tree
assert isinstance(tree.body[0].body[0], ast.FunctionDef)
assert get_parent(tree.body[0].body[0]) is tree.body[0]
assert isinstance(tree.body[0].body[0].body[0], ast.Pass)
assert get_parent(tree.body[0].body[0].body[0]) is tree.body[0].body[0]
def test_no_parent_for_root_node():
tree = ast.parse('a = 1')
add_parent(tree)
with pytest.raises(ValueError):
get_parent(tree)
def test_no_parent_for_unannotated_node():
tree = ast.parse('a = 1')
with pytest.raises(ValueError):
get_parent(tree.body[0])
def test_replaces_parent_of_given_node():
tree = ast.parse('a = func()')
add_parent(tree)
call = tree.body[0].value
tree.body[0] = call
set_parent(call, tree)
assert get_parent(call) == tree
"""Pytest configuration and fixtures for python-minifier tests."""
import os
# Set default environment variable to preserve existing test behavior
# Tests can explicitly unset this if they need to test size-based behavior
os.environ.setdefault('PYMINIFY_FORCE_BEST_EFFORT', '1')
import python_minifier.ast_compat as ast
from python_minifier.ast_annotation import add_parent
from python_minifier.rename import add_namespace, resolve_names
from python_minifier.rename.bind_names import bind_names
from python_minifier.rename.util import iter_child_namespaces
from python_minifier.util import is_constant_node
def assert_namespace_tree(source, expected_tree):
tree = ast.parse(source)
add_parent(tree)
add_namespace(tree)
bind_names(tree)
resolve_names(tree)
actual = print_namespace(tree)
print(actual)
assert actual.strip() == expected_tree.strip()
def print_namespace(namespace, indent=''):
s = ''
if not indent:
s += '\n'
def namespace_name(node):
if is_constant_node(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
return 'Function ' + node.name
elif isinstance(node, ast.ClassDef):
return 'Class ' + node.name
else:
            return node.__class__.__name__
s += indent + '+ ' + namespace_name(namespace) + '\n'
for name in sorted(namespace.global_names):
s += indent + ' - global ' + name + '\n'
for name in sorted(namespace.nonlocal_names):
s += indent + ' - nonlocal ' + name + '\n'
for binding in sorted(namespace.bindings, key=lambda b: b.name or str(b.value)):
s += indent + ' - ' + repr(binding) + '\n'
for child in iter_child_namespaces(namespace):
s += print_namespace(child, indent=indent + ' ')
return s
"""Subprocess compatibility utilities for Python 2.7/3.x."""
import subprocess
import sys
def run_subprocess(cmd, timeout=None, input_data=None, env=None):
"""Cross-platform subprocess runner for Python 2.7+ compatibility."""
if hasattr(subprocess, 'run'):
# Python 3.5+ - encode string input to bytes for subprocess
input_bytes = input_data.encode('utf-8') if isinstance(input_data, str) else input_data
return subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
input=input_bytes, timeout=timeout, env=env)
else:
# Python 2.7, 3.3, 3.4 - no subprocess.run, no timeout support
popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
stdin=subprocess.PIPE if input_data else None, env=env)
# For Python 3.3/3.4, communicate() doesn't support timeout
# Also, Python 3.x needs bytes for stdin, Python 2.x needs str
if input_data and sys.version_info[0] >= 3 and isinstance(input_data, str):
input_data = input_data.encode('utf-8')
stdout, stderr = popen.communicate(input_data)
# Create a simple result object similar to subprocess.CompletedProcess
class Result:
def __init__(self, returncode, stdout, stderr):
self.returncode = returncode
self.stdout = stdout
self.stderr = stderr
return Result(popen.returncode, stdout, stderr)
def safe_decode(data, encoding='utf-8', errors='replace'):
"""Safe decode for Python 2.7/3.x compatibility."""
if isinstance(data, bytes):
try:
return data.decode(encoding, errors)
except UnicodeDecodeError:
return data.decode(encoding, 'replace')
return data
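A usage sketch of the Python 3.5+ branch above, written directly against the stdlib so it runs standalone (the command uses the current interpreter for portability):

```python
import subprocess
import sys

# Echo stdin back uppercased, via a child Python process.
cmd = [sys.executable, '-c', 'import sys; sys.stdout.write(sys.stdin.read().upper())']
# Mirrors run_subprocess on Python 3.5+: capture output, pass encoded stdin.
result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        input='ok'.encode('utf-8'), timeout=30)
out = result.stdout.decode('utf-8', 'replace')  # mirrors safe_decode
```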
"""
Test for raw f-string with backslash escape sequences.
"""
import ast
import sys
import pytest
from python_minifier import unparse
from python_minifier.ast_compare import compare_ast
@pytest.mark.parametrize('source,description', [
# Raw f-string backslash tests - core regression fix
pytest.param(r'rf"{x:\\xFF}"', 'Single backslash in format spec (minimal failing case)', id='raw-fstring-backslash-format-spec'),
pytest.param(r'rf"\\n{x}\\t"', 'Backslashes in literal parts', id='raw-fstring-backslash-outer-str'),
pytest.param(r'rf"\\n{x:\\xFF}\\t"', 'Backslashes in both literal and format spec', id='raw-fstring-mixed-backslashes'),
pytest.param(r'rf"\n"', 'Single backslash in literal only', id='raw-fstring-literal-single-backslash'),
pytest.param(r'rf"\\n"', 'Double backslash in literal only', id='raw-fstring-literal-double-backslash'),
pytest.param(r'rf"{x:\xFF}"', 'Single backslash in format spec only', id='raw-fstring-formatspec-single-backslash'),
pytest.param(r'rf"{x:\\xFF}"', 'Double backslash in format spec only', id='raw-fstring-formatspec-double-backslash'),
pytest.param(r'rf"\n{x:\xFF}\t"', 'Single backslashes in both parts', id='raw-fstring-mixed-single-backslashes'),
# Special characters discovered during fuzzing
pytest.param('f"test\\x00end"', 'Null byte in literal part', id='null-byte-literal'),
pytest.param('f"{x:\\x00}"', 'Null byte in format spec', id='null-byte-format-spec'),
pytest.param('f"test\\rend"', 'Carriage return in literal (must be escaped to prevent semantic changes)', id='carriage-return-literal'),
pytest.param('f"test\\tend"', 'Tab in literal part', id='tab-literal'),
pytest.param('f"{x:\\t}"', 'Tab in format spec', id='tab-format-spec'),
pytest.param('f"test\\x01end"', 'Control character (ASCII 1)', id='control-character'),
pytest.param('f"test\\nend"', 'Newline in single-quoted string', id='newline-single-quote'),
pytest.param('f"""test\nend"""', 'Actual newline in triple-quoted string', id='newline-triple-quote'),
pytest.param('f"\\x00\\r\\t{x}"', 'Mix of null bytes, carriage returns, and tabs', id='mixed-special-chars'),
# Conversion specifiers with special characters
pytest.param(r'rf"{x!r:\\xFF}"', 'Conversion specifier !r with format spec', id='conversion-r-with-backslash'),
pytest.param(r'rf"{x!s:\\xFF}"', 'Conversion specifier !s with format spec', id='conversion-s-with-backslash'),
pytest.param(r'rf"{x!a:\\xFF}"', 'Conversion specifier !a with format spec', id='conversion-a-with-backslash'),
pytest.param('f"{x!r:\\x00}"', 'Conversion specifier with null byte in format spec', id='conversion-with-null-byte'),
# Other edge cases
pytest.param(r'rf"""{x:\\xFF}"""', 'Triple-quoted raw f-string with backslashes', id='raw-fstring-triple-quoted'),
pytest.param(r'rf"{x:\\xFF}{y:\\xFF}"', 'Multiple interpolations with backslashes', id='raw-fstring-multiple-interpolations'),
pytest.param('f"\\\\n{x}\\\\t"', 'Regular (non-raw) f-string with backslashes', id='regular-fstring-with-backslash'),
])
@pytest.mark.skipif(sys.version_info < (3, 6), reason='F-strings not supported in Python < 3.6')
def test_fstring_edge_cases(source, description):
"""Test f-strings with various edge cases including backslashes and special characters."""
expected_ast = ast.parse(source)
actual_code = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(actual_code))
@pytest.mark.parametrize('source,description', [
pytest.param(r'f"{f"\\n{x}\\t"}"', 'Nested f-strings with backslashes in inner string parts', id='nested-fstring-backslashes'),
pytest.param(r'f"{rf"\\xFF{y}\\n"}"', 'Nested raw f-strings with backslashes', id='nested-raw-fstring-backslashes'),
pytest.param(r'f"{f"{x:\\xFF}"}"', 'Nested f-strings with backslashes in format specs', id='nested-fstring-format-spec-backslashes'),
])
@pytest.mark.skipif(sys.version_info < (3, 12), reason='Nested f-strings not supported in Python < 3.12')
def test_nested_fstring_edge_cases(source, description):
"""Test nested f-strings with backslashes (Python 3.12+ only)."""
expected_ast = ast.parse(source)
actual_code = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(actual_code))
@pytest.mark.skipif(sys.version_info < (3, 6), reason='F-strings not supported in Python < 3.6')
def test_fstring_carriage_return_format_spec():
r"""Test f-string with carriage return in format spec.
Note: This is syntactically valid but will fail at runtime with
ValueError: Unknown format code '\xd' for object of type 'int'
However, the minifier correctly escapes the carriage return to prevent
Python from normalizing it to a newline during parsing.
"""
source = 'f"{x:\\r}"'
expected_ast = ast.parse(source)
actual_code = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(actual_code))
"""
Test for raw t-string with backslash escape sequences.
This test covers the same scenarios as test_raw_fstring_backslash.py but for t-strings.
Since t-strings were introduced in Python 3.14 and the raw f-string regression was fixed
in Python 3.14rc2, these tests verify that raw t-strings handle backslashes correctly
from the start, especially in format specs.
"""
import ast
import sys
import pytest
from python_minifier import unparse
from python_minifier.ast_compare import compare_ast
@pytest.mark.parametrize('source,description', [
# Raw t-string backslash tests - core regression testing
pytest.param(r'rt"{x:\\xFF}"', 'Single backslash in format spec (minimal case)', id='raw-tstring-backslash-format-spec'),
pytest.param(r'rt"\\n{x}\\t"', 'Backslashes in literal parts', id='raw-tstring-backslash-outer-str'),
pytest.param(r'rt"\\n{x:\\xFF}\\t"', 'Backslashes in both literal and format spec', id='raw-tstring-mixed-backslashes'),
pytest.param(r'rt"\n"', 'Single backslash in literal only', id='raw-tstring-literal-single-backslash'),
pytest.param(r'rt"\\n"', 'Double backslash in literal only', id='raw-tstring-literal-double-backslash'),
pytest.param(r'rt"{x:\xFF}"', 'Single backslash in format spec only', id='raw-tstring-formatspec-single-backslash'),
pytest.param(r'rt"{x:\\xFF}"', 'Double backslash in format spec only', id='raw-tstring-formatspec-double-backslash'),
pytest.param(r'rt"\n{x:\xFF}\t"', 'Single backslashes in both parts', id='raw-tstring-mixed-single-backslashes'),
# Special characters discovered during fuzzing
pytest.param('t"test\\x00end"', 'Null byte in literal part', id='null-byte-literal'),
pytest.param('t"{x:\\x00}"', 'Null byte in format spec', id='null-byte-format-spec'),
pytest.param('t"test\\rend"', 'Carriage return in literal (must be escaped to prevent semantic changes)', id='carriage-return-literal'),
pytest.param('t"test\\tend"', 'Tab in literal part', id='tab-literal'),
pytest.param('t"{x:\\t}"', 'Tab in format spec', id='tab-format-spec'),
pytest.param('t"test\\x01end"', 'Control character (ASCII 1)', id='control-character'),
pytest.param('t"test\\nend"', 'Newline in single-quoted string', id='newline-single-quote'),
pytest.param('t"""test\nend"""', 'Actual newline in triple-quoted string', id='newline-triple-quote'),
pytest.param('t"\\x00\\r\\t{x}"', 'Mix of null bytes, carriage returns, and tabs', id='mixed-special-chars'),
# Conversion specifiers with special characters
pytest.param(r'rt"{x!r:\\xFF}"', 'Conversion specifier !r with format spec', id='conversion-r-with-backslash'),
pytest.param(r'rt"{x!s:\\xFF}"', 'Conversion specifier !s with format spec', id='conversion-s-with-backslash'),
pytest.param(r'rt"{x!a:\\xFF}"', 'Conversion specifier !a with format spec', id='conversion-a-with-backslash'),
pytest.param('t"{x!r:\\x00}"', 'Conversion specifier with null byte in format spec', id='conversion-with-null-byte'),
# Other edge cases
pytest.param(r'rt"""{x:\\xFF}"""', 'Triple-quoted raw t-string with backslashes', id='raw-tstring-triple-quoted'),
pytest.param(r'rt"{x:\\xFF}{y:\\xFF}"', 'Multiple interpolations with backslashes', id='raw-tstring-multiple-interpolations'),
pytest.param('t"\\\\n{x}\\\\t"', 'Regular (non-raw) t-string with backslashes', id='regular-tstring-with-backslash'),
# Complex format specs - originally in test_raw_tstring_complex_format_specs
pytest.param(r'rt"{x:\\xFF\\n}"', 'Multiple backslashes in single format spec', id='complex-multiple-backslashes'),
pytest.param(r'rt"{x:\\xFF}{y:\\n}"', 'Multiple format specs with backslashes', id='complex-multiple-format-specs'),
pytest.param(r'rt"\\start{x:\\xFF}\\end"', 'Backslashes in both literal and format spec parts', id='complex-mixed-locations'),
pytest.param(r'rt"{x:{fmt:\\n}}"', 'Nested format spec with backslashes', id='complex-nested-format-spec'),
# Unicode escapes - originally in test_raw_tstring_unicode_escapes
pytest.param(r'rt"{x:\u0041}"', 'Unicode escape in format spec', id='unicode-short-escape'),
pytest.param(r'rt"{x:\U00000041}"', 'Long Unicode escape in format spec', id='unicode-long-escape'),
pytest.param(r'rt"\\u0041{x:\xFF}"', 'Unicode in literal, hex in format spec', id='unicode-mixed'),
# Mixed t-string and f-string
pytest.param(r'rt"t-string \\n {f"f-string {x:\\xFF}"} \\t"', 'Nested combination of raw t-strings and f-strings', id='mixed-tstring-fstring'),
])
@pytest.mark.skipif(sys.version_info < (3, 14), reason='T-strings not supported in Python < 3.14')
def test_tstring_edge_cases(source, description):
"""Test t-strings with various edge cases including backslashes and special characters."""
expected_ast = ast.parse(source)
actual_code = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(actual_code))
@pytest.mark.parametrize('source,description', [
pytest.param(r't"{t"\\n{x}\\t"}"', 'Nested t-strings with backslashes in inner string parts', id='nested-tstring-backslashes'),
pytest.param(r't"{rt"\\xFF{y}\\n"}"', 'Nested raw t-strings with backslashes', id='nested-raw-tstring-backslashes'),
pytest.param(r't"{t"{x:\\xFF}"}"', 'Nested t-strings with backslashes in format specs', id='nested-tstring-format-spec-backslashes'),
])
@pytest.mark.skipif(sys.version_info < (3, 14), reason='T-strings not supported in Python < 3.14')
def test_nested_tstring_edge_cases(source, description):
"""Test nested t-strings with backslashes."""
expected_ast = ast.parse(source)
actual_code = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(actual_code))
@pytest.mark.skipif(sys.version_info < (3, 14), reason='T-strings not supported in Python < 3.14')
def test_tstring_carriage_return_format_spec():
r"""Test t-string with carriage return in format spec.
Note: This is syntactically valid but will fail at runtime with
ValueError: Unknown format code '\xd' for object of type 'int'
However, unlike f-strings, t-strings can successfully unparse this case.
"""
source = 't"{x:\\r}"'
expected_ast = ast.parse(source)
actual_code = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(actual_code))
import ast
import sys
import pytest
from python_minifier import unparse
from python_minifier.ast_compare import compare_ast
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
@pytest.mark.parametrize(
'statement', [
't"hello"',
't"Hello {name}"',
't"Hello {name!r}"',
't"Hello {name!s}"',
't"Hello {name!a}"',
't"Value: {value:.2f}"',
't"{1}"',
't"{1=}"',
't"{1=!r:.4}"',
't"{1=:.4}"',
't"{1=!s:.4}"',
't"{1=!a}"',
]
)
def test_tstring_basic(statement):
"""Test basic t-string parsing and unparsing"""
assert unparse(ast.parse(statement)) == statement
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
@pytest.mark.parametrize(
'statement', [
't"Hello {name} and {other}"',
't"User {action}: {amount:.2f} {item}"',
't"{value:.{precision}f}"',
't"Complex {a} and {b!r} with {c:.3f}"',
]
)
def test_tstring_multiple_interpolations(statement):
"""Test t-strings with multiple interpolations"""
assert unparse(ast.parse(statement)) == statement
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
@pytest.mark.parametrize(
'statement', [
't"nested {t"inner {x}"}"',
't"outer {t"middle {t"inner {y}"}"} end"',
't"complex {t"nested {value:.2f}"} result"',
't"{t"prefix {name}"} suffix"',
]
)
def test_tstring_nesting(statement):
"""Test nested t-strings (should work with PEP 701 benefits)"""
assert unparse(ast.parse(statement)) == statement
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_quote_variations():
"""Test different quote styles for t-strings - just ensure they parse and unparse correctly"""
statements = [
"t'single quotes {name}'",
't"""triple double quotes {name}"""',
"t'''triple single quotes {name}'''",
't"mixed {name} with \\"escaped\\" quotes"',
"t'mixed {name} with \\'escaped\\' quotes'",
]
for statement in statements:
# Just test that it parses and round-trips correctly, don't care about exact quote style
parsed = ast.parse(statement)
unparsed = unparse(parsed)
reparsed = ast.parse(unparsed)
compare_ast(parsed, reparsed)
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_multiline():
"""Test multiline t-strings"""
statement = '''t"""
Multiline template
with {name} interpolation
and {value:.2f} formatting
"""'''
expected = 't"\\nMultiline template\\nwith {name} interpolation\\nand {value:.2f} formatting\\n"'
assert unparse(ast.parse(statement)) == expected
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_with_complex_expressions():
"""Test t-strings with complex expressions in interpolations"""
test_cases = [
('t"Result: {func(a, b, c)}"', 't"Result: {func(a,b,c)}"'),
('t"List: {[x for x in items]}"', 't"List: {[x for x in items]}"'),
('t"Dict: {key: value for key, value in pairs}"', 't"Dict: {key: value for key, value in pairs}"'),
('t"Set: {{item for item in collection}}"', 't"Set: {{item for item in collection}}"'),
('t"Lambda: {(lambda x: x * 2)(value)}"', 't"Lambda: {((lambda x:x*2))(value)}"'),
('t"Attribute: {obj.attr.method()}"', 't"Attribute: {obj.attr.method()}"'),
('t"Subscription: {data[key][0]}"', 't"Subscription: {data[key][0]}"'),
('t"Ternary: {x if condition else y}"', 't"Ternary: {x if condition else y}"'),
]
for input_statement, expected_output in test_cases:
assert unparse(ast.parse(input_statement)) == expected_output
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_with_binary_operations():
"""Test t-strings with binary operations in interpolations"""
test_cases = [
('t"Sum: {a + b}"', 't"Sum: {a+b}"'),
('t"Product: {x * y}"', 't"Product: {x*y}"'),
('t"Division: {total / count}"', 't"Division: {total/count}"'),
('t"Complex: {(a + b) * (c - d)}"', 't"Complex: {(a+b)*(c-d)}"'),
('t"String concat: {first + last}"', 't"String concat: {first+last}"'),
('t"Comparison: {x > y}"', 't"Comparison: {x>y}"'),
('t"Boolean: {a and b}"', 't"Boolean: {a and b}"'),
('t"Bitwise: {x | y}"', 't"Bitwise: {x|y}"'),
]
for input_statement, expected_output in test_cases:
assert unparse(ast.parse(input_statement)) == expected_output
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_empty():
"""Test empty t-string"""
statement = 't""'
assert unparse(ast.parse(statement)) == statement
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_only_interpolation():
"""Test t-string with only interpolation, no literal parts"""
statement = 't"{value}"'
assert unparse(ast.parse(statement)) == statement
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_special_characters():
"""Test t-strings with special characters that need escaping"""
statements = [
't"Braces: {{literal}} and {variable}"',
't"Newline: \\n and {value}"',
't"Tab: \\t and {value}"',
't"Quote: \\" and {value}"',
"t'Quote: \\' and {value}'",
't"Backslash: \\\\ and {value}"',
]
for statement in statements:
# Test that it parses and round-trips correctly
parsed = ast.parse(statement)
unparsed = unparse(parsed)
reparsed = ast.parse(unparsed)
compare_ast(parsed, reparsed)
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_ast_structure():
"""Test that t-string AST structure is correctly preserved"""
source = 't"Hello {name} world {value:.2f}!"'
expected_ast = ast.parse(source) # Parse as module, not expression
actual_ast = ast.parse(unparse(expected_ast))
compare_ast(expected_ast, actual_ast)
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_vs_fstring_syntax():
"""Test that t-strings and f-strings have similar but distinct syntax"""
# These should both parse successfully but produce different ASTs
tstring = 't"Hello {name}"'
fstring = 'f"Hello {name}"'
t_ast = ast.parse(tstring)
f_ast = ast.parse(fstring)
# Should be different node types in the expression
assert type(t_ast.body[0].value).__name__ == 'TemplateStr'
assert type(f_ast.body[0].value).__name__ == 'JoinedStr'
# But should unparse correctly
assert unparse(t_ast) == tstring
assert unparse(f_ast) == fstring
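The f-string half of the comparison above can be checked with the stdlib alone on any Python 3.6+, without python_minifier; a minimal sketch:

```python
import ast

# JoinedStr is the stdlib AST node type for f-strings; t-strings (Python
# 3.14+) get the distinct TemplateStr node instead.
tree = ast.parse('f"Hello {name}"')
node = tree.body[0].value
assert type(node).__name__ == 'JoinedStr'
```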
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_raw_template_strings():
"""Test raw template strings (rt prefix) - they parse but unparser loses the raw prefix"""
if sys.version_info >= (3, 14): # Raw t-strings are supported in Python 3.14
# Test that raw t-strings parse correctly
raw_statements = [
'rt"raw template {name}"',
'rt"backslash \\\\ preserved {name}"',
]
for statement in raw_statements:
# Raw t-strings should parse successfully
ast.parse(statement)
# Test that raw behavior is preserved in the AST even if prefix is lost
raw_backslash = 'rt"backslash \\\\n and {name}"'
regular_backslash = 't"backslash \\n and {name}"' # Only two backslashes for regular
raw_ast = ast.parse(raw_backslash)
regular_ast = ast.parse(regular_backslash)
# The AST should show different string content
raw_content = raw_ast.body[0].value.values[0].value
regular_content = regular_ast.body[0].value.values[0].value
# Raw should have literal backslash-n, regular should have actual newline
assert '\\\\n' in raw_content # literal backslash-n (two chars: \ and n)
assert '\n' in regular_content # actual newline character
assert raw_content != regular_content
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_debug_specifier_limitations():
"""Test debug specifier limitations (same as f-strings)"""
# Debug specifiers work when at the start of the string
assert unparse(ast.parse('t"{name=}"')) == 't"{name=}"'
assert unparse(ast.parse('t"{value=:.2f}"')) == 't"{value=:.2f}"'
# But are lost when there's a preceding literal (same limitation as f-strings)
assert unparse(ast.parse('t"Hello {name=}"')) == 't"Hello name={name!r}"'
assert unparse(ast.parse('t"Hello {name=!s}"')) == 't"Hello name={name!s}"'
assert unparse(ast.parse('t"Hello {name=:.2f}"')) == 't"Hello name={name:.2f}"'
# This matches f-string behavior exactly
assert unparse(ast.parse('f"Hello {name=}"')) == 'f"Hello name={name!r}"'
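The debug-specifier expansions asserted above mirror runtime f-string behaviour (Python 3.8+); a small sketch showing what `=` actually produces:

```python
name = 'world'
# `{name=}` expands to the expression text plus its repr.
assert f"{name=}" == "name='world'"

# With a format spec, the implicit !r conversion is dropped in favour of
# format(), which is why the unparser emits `name={name:.2f}` above.
value = 3.14159
assert f"{value=:.2f}" == "value=3.14"
```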
@pytest.mark.skipif(sys.version_info < (3, 14), reason="Template strings require Python 3.14+")
def test_tstring_error_conditions():
"""Test that our implementation handles edge cases properly"""
# Test round-trip parsing for complex cases
complex_cases = [
't"Deep {t"nesting {t"level {x}"}"} works"',
't"Format {value:{width}.{precision}f} complex"',
't"Mixed {a!r} and {b=:.2f} specifiers"',
]
for case in complex_cases:
try:
# Parse as module, not expression
expected_ast = ast.parse(case)
unparsed = unparse(expected_ast)
actual_ast = ast.parse(unparsed)
compare_ast(expected_ast, actual_ast)
except Exception as e:
pytest.fail("Failed to handle complex case {}: {}".format(case, e))
import ast
from python_minifier import unparse
from python_minifier.ast_compare import compare_ast
def test_single_element_tuple_in_with():
"""Test that single-element tuples in with statements are preserved during minification."""
source = 'with(None,):pass'
expected_ast = ast.parse(source)
minified = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(minified))
def test_tuple_with_multiple_elements():
"""Test that multi-element tuples in with statements work correctly."""
source = 'with(a,b):pass'
expected_ast = ast.parse(source)
minified = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(minified))
def test_nested_tuple_with():
"""Test nested tuple structures in with statements."""
source = 'with((a,),b):pass'
expected_ast = ast.parse(source)
minified = unparse(expected_ast)
compare_ast(expected_ast, ast.parse(minified))
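The same round-trip property can be sanity-checked against the stdlib unparser (`ast.unparse`, Python 3.9+) rather than python_minifier; a sketch:

```python
import ast

# Parse, unparse, and reparse a with statement containing a trailing
# comma; the two ASTs should dump identically regardless of whether the
# grammar treats `(None,)` as a tuple or as a parenthesized item list.
src = 'with (None,): pass'
tree = ast.parse(src)
roundtrip = ast.parse(ast.unparse(tree))
assert ast.dump(tree) == ast.dump(roundtrip)
```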
Metadata-Version: 2.4
Name: python_minifier
Version: 3.0.0
Version: 3.1.0
Summary: Transform Python source code into its most compact representation

@@ -28,2 +28,3 @@ Home-page: https://github.com/dflook/python-minifier

Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: 2

@@ -35,3 +36,3 @@ Classifier: Programming Language :: Python :: 2.7

Classifier: Topic :: Software Development
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, <3.14
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, <3.15
Description-Content-Type: text/markdown

@@ -58,3 +59,3 @@ License-File: LICENSE

python-minifier currently supports Python 2.7 and Python 3.3 to 3.13. Previous releases supported Python 2.6.
python-minifier currently supports Python 2.7 and Python 3.3 to 3.14. Previous releases supported Python 2.6.

@@ -61,0 +62,0 @@ * [PyPI](https://pypi.org/project/python-minifier/)

@@ -7,3 +7,3 @@ # Python Minifier

python-minifier currently supports Python 2.7 and Python 3.3 to 3.13. Previous releases supported Python 2.6.
python-minifier currently supports Python 2.7 and Python 3.3 to 3.14. Previous releases supported Python 2.6.

@@ -10,0 +10,0 @@ * [PyPI](https://pypi.org/project/python-minifier/)

@@ -23,3 +23,2 @@ import os.path

package_dir={'': 'src'},

@@ -31,4 +30,4 @@ packages=find_packages('src'),

python_requires='>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, <3.14',
version='3.0.0',
python_requires='>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, <3.15',
version='3.1.0',

@@ -51,2 +50,3 @@ classifiers=[

'Programming Language :: Python :: 3.13',
'Programming Language :: Python :: 3.14',
'Programming Language :: Python :: 2',

@@ -53,0 +53,0 @@ 'Programming Language :: Python :: 2.7',

Metadata-Version: 2.4
Name: python_minifier
Version: 3.0.0
Version: 3.1.0
Summary: Transform Python source code into its most compact representation

@@ -28,2 +28,3 @@ Home-page: https://github.com/dflook/python-minifier

Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: 2

@@ -35,3 +36,3 @@ Classifier: Programming Language :: Python :: 2.7

Classifier: Topic :: Software Development
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, <3.14
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, <3.15
Description-Content-Type: text/markdown

@@ -58,3 +59,3 @@ License-File: LICENSE

python-minifier currently supports Python 2.7 and Python 3.3 to 3.13. Previous releases supported Python 2.6.
python-minifier currently supports Python 2.7 and Python 3.3 to 3.14. Previous releases supported Python 2.6.

@@ -61,0 +62,0 @@ * [PyPI](https://pypi.org/project/python-minifier/)

LICENSE
MANIFEST.in
README.md

@@ -15,2 +16,3 @@ setup.py

src/python_minifier/py.typed
src/python_minifier/t_string.py
src/python_minifier/token_printer.py

@@ -25,2 +27,3 @@ src/python_minifier/util.py

src/python_minifier/ast_annotation/__init__.py
src/python_minifier/rename/README.md
src/python_minifier/rename/__init__.py

@@ -50,2 +53,6 @@ src/python_minifier/rename/bind_names.py

src/python_minifier/transforms/suite_transformer.py
test/conftest.py
test/helpers.py
test/requirements.txt
test/subprocess_compat.py
test/test_assignment_expressions.py

@@ -77,2 +84,4 @@ test/test_await_fstring.py

test/test_preserve_shebang.py
test/test_raw_fstring_backslash.py
test/test_raw_tstring_backslash.py
test/test_remove_annotations.py

@@ -89,4 +98,7 @@ test/test_remove_assert.py

test/test_slice.py
test/test_template_strings.py
test/test_tuple_with_bug.py
test/test_type_param_defaults.py
test/test_unicode_cli.py
test/test_utf8_encoding.py
test/ast_annotation/test_add_parent.py

@@ -62,6 +62,9 @@ import python_minifier.ast_compat as ast

for field in set(l_ast._fields + r_ast._fields):
for field in sorted(set(l_ast._fields + r_ast._fields)):
if field == 'kind' and isinstance(l_ast, ast.Constant):
continue
if field == 'str' and hasattr(ast, 'Interpolation') and isinstance(l_ast, ast.Interpolation):
continue

@@ -68,0 +71,0 @@ if isinstance(getattr(l_ast, field, None), list):

@@ -73,2 +73,4 @@ """

'TypeVarTuple',
'TemplateStr',
'Interpolation',
'YieldFrom',

@@ -75,0 +77,0 @@ 'arg',

@@ -746,2 +746,9 @@ import sys

def visit_TemplateStr(self, node):
assert isinstance(node, ast.TemplateStr)
import python_minifier.t_string
self.printer.tstring(str(python_minifier.t_string.TString(node)))
def visit_NamedExpr(self, node):

@@ -748,0 +755,0 @@ self._expression(node.target)

@@ -11,2 +11,3 @@ """

import re
import sys

@@ -62,7 +63,8 @@ import python_minifier.ast_compat as ast

def candidates(self):
actual_candidates = []
def _generate_candidates_with_processor(self, prefix, str_processor):
"""Generate f-string candidates using the given prefix and string processor function."""
candidates = []
for quote in self.allowed_quotes:
candidates = ['']
quote_candidates = ['']
debug_specifier_candidates = []

@@ -76,10 +78,8 @@ nested_allowed = copy.copy(self.allowed_quotes)

if is_constant_node(v, ast.Str):
# Could this be used as a debug specifier?
if len(candidates) < 10:
if len(quote_candidates) < 10:
debug_specifier = re.match(r'.*=\s*$', v.s)
if debug_specifier:
# Maybe!
try:
debug_specifier_candidates = [x + '{' + v.s for x in candidates]
debug_specifier_candidates = [x + '{' + v.s for x in quote_candidates]
except Exception:

@@ -89,3 +89,3 @@ continue

try:
candidates = [x + self.str_for(v.s, quote) for x in candidates]
quote_candidates = [x + str_processor(v.s, quote) for x in quote_candidates]
except Exception:

@@ -96,4 +96,4 @@ continue

completed = self.complete_debug_specifier(debug_specifier_candidates, v)
candidates = [
x + y for x in candidates for y in FormattedValue(v, nested_allowed, self.pep701).get_candidates()
quote_candidates = [
x + y for x in quote_candidates for y in FormattedValue(v, nested_allowed, self.pep701).get_candidates()
] + completed

@@ -106,10 +106,67 @@ debug_specifier_candidates = []

actual_candidates += ['f' + quote + x + quote for x in candidates]
candidates += [prefix + quote + x + quote for x in quote_candidates]
return candidates
def candidates(self):
actual_candidates = []
# Normal f-string candidates
actual_candidates += self._generate_candidates_with_processor('f', self.str_for)
# Raw f-string candidates (if we detect backslashes)
if self._contains_literal_backslashes():
actual_candidates += self._generate_candidates_with_processor('rf', lambda s, quote: self.raw_str_for(s))
return filter(self.is_correct_ast, actual_candidates)
def str_for(self, s, quote):
def raw_str_for(self, s):
"""
Generate string representation for raw f-strings.
Don't escape backslashes like MiniString does.
"""
return s.replace('{', '{{').replace('}', '}}')
def _contains_literal_backslashes(self):
"""
Check if this f-string contains literal backslashes in constant values.
This indicates it may need to be a raw f-string.
"""
for node in ast.walk(self.node):
if is_constant_node(node, ast.Str):
if '\\' in node.s:
return True
return False
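The backslash detection above can be sketched with the stdlib alone: walk the f-string's AST and look for a literal backslash in any string constant (using `ast.Constant` here rather than the package's `is_constant_node` helper):

```python
import ast

# The source escape `\\` becomes a single backslash character in the
# parsed constant, which is what the detection looks for.
tree = ast.parse(r'x = f"path\\to\\{name}"')
has_backslash = any(
    isinstance(n, ast.Constant) and isinstance(n.value, str) and '\\' in n.value
    for n in ast.walk(tree)
)
assert has_backslash
```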
def str_for(self, s, quote):
# Escape null bytes and other characters that can't appear in Python source
escaped = ''
is_multiline = len(quote) == 3 # Triple-quoted strings
for c in s:
if c == '\0':
escaped += '\\x00'
elif c == '\n' and not is_multiline:
# Only escape newlines in single-quoted strings
escaped += '\\n'
elif c == '\r':
# Always escape carriage returns because Python normalizes them during parsing
# This prevents semantic changes (\\r -> \\n) in multiline strings
escaped += '\\r'
elif c == '\t':
# Always escape tabs for consistency (though not strictly necessary in multiline)
escaped += '\\t'
elif c == '{':
escaped += '{{'
elif c == '}':
escaped += '}}'
elif ord(c) < 32 and c not in '\n\r\t':
# Escape other control characters
escaped += '\\x{:02x}'.format(ord(c))
else:
escaped += c
return escaped
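The comment above about carriage returns can be demonstrated directly: Python translates line endings in source text, so a literal CR inside a triple-quoted string is parsed as LF, which is why `\r` must always be emitted as an escape:

```python
import ast

# A raw CR in source text is normalized to LF by the tokenizer, changing
# the string's value; escaping it as \r avoids the semantic change.
src = 'x = """a\rb"""'
value = ast.parse(src).body[0].value.value
assert value == 'a\nb'
```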
class OuterFString(FString):

@@ -294,3 +351,5 @@ """

if c == '\n':
if c == '\0':
literal += '\\x00'
elif c == '\n':
literal += '\\n'

@@ -312,3 +371,3 @@ elif c == '\r':

if '\0' in self._s or ('\\' in self._s and not self.pep701):
if '\\' in self._s and not self.pep701:
raise ValueError('Impossible to represent a character in f-string expression part')

@@ -371,5 +430,33 @@

def str_for(self, s):
return s.replace('{', '{{').replace('}', '}}')
# Special handling for problematic format spec characters that can cause parsing issues
# If the format spec contains only braces, it's likely an invalid test case
# Escape null bytes and other unprintable characters
escaped = ''
for c in s:
if c == '\0':
escaped += '\\x00'
elif c == '{':
escaped += '{{'
elif c == '}':
escaped += '}}'
elif c == '\\':
# For Python 3.12+ raw f-string regression (fixed in 3.14rc2), we need to escape backslashes
# in format specs so they round-trip correctly
if (3, 12) <= sys.version_info < (3, 14):
escaped += '\\\\'
else:
escaped += c
elif c == '\r':
# Always escape carriage returns because Python normalizes them to newlines during parsing
# This prevents AST mismatches (\r -> \n normalization)
escaped += '\\r'
elif ord(c) < 32 and c not in '\t\n':
# Escape other control characters except tab, newline
escaped += '\\x{:02x}'.format(ord(c))
else:
escaped += c
return escaped
class Bytes(object):

@@ -423,4 +510,21 @@ """

literal = 'b' + self.current_quote
literal += chr(b)
# Handle special characters that need escaping
if b == 0: # null byte
literal += '\\x00'
elif b == ord('\\'): # backslash
literal += '\\\\'
elif b == ord('\n'): # newline
literal += '\\n'
elif b == ord('\r'): # carriage return
literal += '\\r'
elif b == ord('\t'): # tab
literal += '\\t'
elif len(self.current_quote) == 1 and b == ord(self.current_quote): # single quote character
literal += '\\' + self.current_quote
elif 32 <= b <= 126: # printable ASCII
literal += chr(b)
else: # other non-printable characters
literal += '\\x{:02x}'.format(b)
if literal:

@@ -434,4 +538,2 @@ literal += self.current_quote

if b'\0' in self._b or b'\\' in self._b:
raise ValueError('Impossible to represent a %r character in f-string expression part')

@@ -438,0 +540,0 @@ if b'\n' in self._b or b'\r' in self._b:
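The byte-escaping branch added above maps each byte to a source-safe form; a standalone sketch of the same rules (hypothetical `escape_byte` helper, single-quote delimiter assumed):

```python
def escape_byte(b, quote="'"):
    # Named escapes first, then the active quote character, then printable
    # ASCII verbatim; everything else falls back to \xNN.
    specials = {0: '\\x00', ord('\\'): '\\\\', ord('\n'): '\\n',
                ord('\r'): '\\r', ord('\t'): '\\t'}
    if b in specials:
        return specials[b]
    if b == ord(quote):
        return '\\' + quote
    if 32 <= b <= 126:
        return chr(b)
    return '\\x{:02x}'.format(b)
```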

@@ -223,2 +223,9 @@ import python_minifier.ast_compat as ast

def visit_TemplateStr(self, node):
for v in node.values:
if is_constant_node(v, ast.Str):
# Can't hoist string literals that are part of the template
continue
self.visit(v)
def visit_NameConstant(self, node):

@@ -225,0 +232,0 @@ self.get_binding(node.value, node).add_reference(node)

@@ -184,2 +184,12 @@ """Tools for assembling python code from tokens."""

def tstring(self, s):
"""Add a template string (t-string) to the output code."""
assert isinstance(s, str)
if self.previous_token in [TokenTypes.Identifier, TokenTypes.Keyword, TokenTypes.SoftKeyword]:
self.delimiter(' ')
self._code += s
self.previous_token = TokenTypes.NonNumberLiteral
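The delimiter check in `tstring` above guards against a string prefix fusing onto the preceding identifier or keyword; a sketch using an f-string prefix, since pre-3.14 tokenizers handle `t"..."` differently but the fusion behaviour is the same:

```python
import io
import tokenize

def name_tokens(src):
    # Collect only the NAME tokens from a source line.
    return [tok.string
            for tok in tokenize.generate_tokens(io.StringIO(src).readline)
            if tok.type == tokenize.NAME]

# With a space, 'a' stays a separate token; without it, the prefix fuses
# into a single identifier 'af' and the string loses its prefix.
assert name_tokens('a f"x"\n') == ['a']
assert name_tokens('af"x"\n') == ['af']
```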
def delimiter(self, d):

@@ -186,0 +196,0 @@ """Add a delimiter to the output code."""

@@ -201,3 +201,3 @@ import ast

from __future__ import print_function
from __future__ import sausages
from __future__ import unicode_literals
import collections

@@ -211,3 +211,3 @@ A = b'Hello'

from __future__ import print_function
from __future__ import sausages
from __future__ import unicode_literals
C = b'Hello'

@@ -214,0 +214,0 @@ import collections

@@ -16,2 +16,4 @@ import sys

def test_type_nodes():
if sys.version_info >= (3, 14):
pytest.skip('Deprecated AST types removed in Python 3.14')
assert is_constant_node(ast.Str('a'), ast.Str)

@@ -46,2 +48,4 @@

pytest.skip('Constant not available')
if sys.version_info >= (3, 14):
pytest.skip('Deprecated AST types removed in Python 3.14')

@@ -56,1 +60,37 @@ assert is_constant_node(ast.Constant('a'), ast.Str)

assert is_constant_node(ast.Constant(ast.literal_eval('...')), ast.Ellipsis)
def test_ast_compat_types_python314():
"""Test that ast_compat provides the removed AST types in Python 3.14+"""
if sys.version_info < (3, 14):
pytest.skip('ast_compat types test only for Python 3.14+')
import python_minifier.ast_compat as ast_compat
# Test that ast_compat provides the removed types
assert is_constant_node(ast_compat.Str('a'), ast_compat.Str)
assert is_constant_node(ast_compat.Bytes(b'a'), ast_compat.Bytes)
assert is_constant_node(ast_compat.Num(1), ast_compat.Num)
assert is_constant_node(ast_compat.Num(0), ast_compat.Num)
assert is_constant_node(ast_compat.NameConstant(True), ast_compat.NameConstant)
assert is_constant_node(ast_compat.NameConstant(False), ast_compat.NameConstant)
assert is_constant_node(ast_compat.NameConstant(None), ast_compat.NameConstant)
assert is_constant_node(ast_compat.Ellipsis(), ast_compat.Ellipsis)
def test_ast_compat_constant_nodes_python314():
"""Test that ast_compat works with Constant nodes in Python 3.14+"""
if sys.version_info < (3, 14):
pytest.skip('ast_compat constant test only for Python 3.14+')
import python_minifier.ast_compat as ast_compat
# Test that Constant nodes work with ast_compat types
assert is_constant_node(ast.Constant('a'), ast_compat.Str)
assert is_constant_node(ast.Constant(b'a'), ast_compat.Bytes)
assert is_constant_node(ast.Constant(1), ast_compat.Num)
assert is_constant_node(ast.Constant(0), ast_compat.Num)
assert is_constant_node(ast.Constant(True), ast_compat.NameConstant)
assert is_constant_node(ast.Constant(False), ast_compat.NameConstant)
assert is_constant_node(ast.Constant(None), ast_compat.NameConstant)
assert is_constant_node(ast.Constant(ast.literal_eval('...')), ast_compat.Ellipsis)
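The compat idea these tests exercise can be sketched without the package: classify `ast.Constant` nodes by the Python type of their value instead of relying on the legacy node classes removed in 3.14 (the `is_str_node` helper here is illustrative, not the package's API):

```python
import ast

def is_str_node(node):
    # Stand-in for the old ast.Str check: a Constant whose value is a str.
    return isinstance(node, ast.Constant) and isinstance(node.value, str)

assert is_str_node(ast.parse("'a'").body[0].value)
assert not is_str_node(ast.parse('1').body[0].value)
```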

@@ -33,4 +33,8 @@ # -*- coding: utf-8 -*-

# Verify the output file was created and contains Unicode characters
with codecs.open(output_path, 'r', encoding='utf-8') as f:
minified_content = f.read()
if sys.version_info[0] >= 3:
with open(output_path, 'r', encoding='utf-8') as f:
minified_content = f.read()
else:
with codecs.open(output_path, 'r', encoding='utf-8') as f:
minified_content = f.read()

@@ -92,4 +96,8 @@ # Verify problematic Unicode characters are preserved

with codecs.open(temp_file.name, 'r', encoding='utf-8') as f:
content = f.read()
if sys.version_info[0] >= 3:
with open(temp_file.name, 'r', encoding='utf-8') as f:
content = f.read()
else:
with codecs.open(temp_file.name, 'r', encoding='utf-8') as f:
content = f.read()

@@ -96,0 +104,0 @@ if hasattr(sys, 'pypy_version_info') and sys.version_info[0] >= 3:

# -*- coding: utf-8 -*-
import pytest
import python_minifier

@@ -7,4 +6,4 @@ import tempfile

import codecs
import sys
def test_minify_utf8_file():

@@ -56,4 +55,8 @@ """Test minifying a Python file with UTF-8 characters not in Windows default encoding."""

# Python 2.7 doesn't support encoding parameter in open()
with codecs.open(temp_file, 'r', encoding='utf-8') as f:
original_content = f.read()
if sys.version_info[0] >= 3:
with open(temp_file, 'r', encoding='utf-8') as f:
original_content = f.read()
else:
with codecs.open(temp_file, 'r', encoding='utf-8') as f:
original_content = f.read()
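An alternative to branching on `sys.version_info` as above: `io.open` accepts an `encoding` argument on both Python 2.7 and Python 3, so a single call covers both. A sketch with a throwaway temp file:

```python
import io
import os
import tempfile

# Write and read back non-ASCII text through io.open, which behaves like
# the Python 3 built-in open on both major versions.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(u'\u0393\u03b5\u03b9\u03b1')
    with io.open(path, 'r', encoding='utf-8') as f:
        content = f.read()
finally:
    os.remove(path)
assert content == u'\u0393\u03b5\u03b9\u03b1'
```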

@@ -68,8 +71,8 @@ # This should work - minify the UTF-8 content

exec(minified, minified_globals)
# The minified code should contain the same functions that return Unicode
assert 'greet_in_greek' in minified_globals
assert u"Γεια σας κόσμος" == minified_globals['greet_in_greek']()
# Test that mathematical symbols are also preserved
assert 'mathematical_formula' in minified_globals

@@ -107,7 +110,7 @@ assert u"∑ from i=1 to ∞" in minified_globals['mathematical_formula']()

exec(minified, minified_globals)
# Test that the functions return the correct Unicode strings
assert u"🐍" in minified_globals['emoji_function']()
assert u"∆" in minified_globals['emoji_function']()
# Test the class

@@ -114,0 +117,0 @@ unicode_obj = minified_globals['UnicodeClass']()