Socket
Socket
Sign inDemoInstall

xml_normalize

Package Overview
Dependencies
2
Maintainers
1
Versions
16
Alerts
File Explorer

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

xml_normalize

Normalizes xml files. Options include sorting siblings, remove nodes, whitespace normalization and pretty print.


Version published
Weekly downloads
7K
increased by24.24%
Maintainers
1
Created
Weekly downloads
 

Readme

Source

npm Coverage Status Language grade: JavaScript

XML Normalize

This program allows normalizing arbitrary xml files. Normalization can be configured:

  • sort sibling elements based on some attribute value
  • remove unwanted nodes
  • trim texts
  • normalize whitespaces/line breaks

This can be used as a post-/pre-processing step to keep diffs small for generated xml files.

Usage

Either install via npm i -g xml_normalize or run directly with npx xml_normalize.

Usage: npx xml_normalize [options]

Options:
  -i, --input-file <inputFile>       input file
  -o, --output-file <outputFile>     output file - if not provided result is printed to stdout
  -r, --remove-path <removePath...>  simple XPath(s) to remove elements - e.g. "/html/head[1]/script"
  -s, --sort-path <sortPath>         simple XPath that references an attribute to sort - e.g. "/html/head[1]/script/@src"
  --no-pretty                        Disable pretty format output
  --no-trim                          Disable trimming of whitespace at the beginning and end of text nodes (trims only pure text nodes)
  --no-attribute-trim                Disable trimming whitespace at the beginning and end of attribute values
  -tf, --trim-force                  Trim the whitespace at the beginning and end of text nodes (trims as well text adjacent to nested nodes)
  -n, --normalize-whitespace         Normalize whitespaces inside text nodes and attribute values
  -d, --debug                        enable debug output
  -h, --help                         display help for command

Options and Examples

Sorting

Allows to sort siblings at a specific path with the same tag name lexicographically based on a specific attribute value.

Example:

<root>
  <node>
    <child id="z">should be last</child>
    <child id="a">should be first</child>
  </node>
  <node>
    <child id="y">should be last</child>
    <child id="b">should be first</child>
  </node>
</root>

npx xml_normalize -s /root/node/child/@id will create:

<root>
  <node>
    <child id="a">should be first</child>
    <child id="z">should be last</child>
  </node>
  <node>
    <child id="b">should be first</child>
    <child id="y">should be last</child>
  </node>
</root>

Removing

Allows to remove nodes in a specific path.

Example:

<root>
  <node>
    <child id="z">should be removed</child>
    <child id="a">should be removed</child>
  </node>
  <node>
    <child id="y">should stay</child>
    <child id="b">should stay</child>
  </node>
</root>

npx xml_normalize -r /root/node[1]/child will create:

<root>
  <node/>
  <node>
    <child id="b">should stay</child>
    <child id="y">should stay</child>
  </node>
</root>

npx xml_normalize -r /root/node/child instead, will create:

<root>
  <node/>
  <node/>
</root>

Normalize whitespace

This option replaces any number of consecutive whitespace, tab, new line characters with a single whitespace (in text nodes).

Example:

<root>
  <node>
    <child id="z">some    xml
    has messed up 
    formatting
    </child>
      
      
    <child id="sometimes      even attributes are messed 
    up">some more     mess</child>
  </node>
</root>

npx xml_normalize --normalize-whitespace will create:

<root>
  <node>
    <child id="z">some xml has messed up formatting</child>
    <child id="sometimes even attributes are messed up">some more mess</child>
  </node>
</root>

Paths for sorting and removing

Paths are a simple subset of XPaths.

/ROOT/NODE_NAME[INDEX]/ANOTHER_NODE

Supported:

  • Only absolute paths
  • Index access (note in XPath indices are 1-based!)
  • Simple predicates using the following functions (parameters can be string (double quotes) or XPaths):
    • starts-with(str,prefix)
    • contains(str,contained)
  • Node wildcard - e.g /root/* to select all nodes in root of any type.
  • Attribute reference in last node - e.g. /root/node/@id.

What is this good for?

This helps to bring xml in a standardized form, so that changes can easily be spotted in diff tool or git pull request.

For example, you could run it as a post processing/pre commit script when re-generating XLIFF translation files (or getting them back from your beloved translator in a messed up form).

Contribute

PRs always welcome :-)

Keywords

FAQs

Package last updated on 13 Oct 2022

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap

Packages

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc