New Case Study:See how Anthropic automated 95% of dependency reviews with Socket.Learn More →

wordcanvas-docsaid

Package Overview

Dependencies

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

wordcanvas-docsaid

Generating text with custom fonts and styles.

2.1.0
PyPI

Maintainers: 1

English | 中文

WordCanvas

Introduction

This project is a text image rendering tool based on Pillow, designed specifically for random image generation.

By adding a variety of parameter settings, users can flexibly adjust input text, font styles, colors, and other attributes to achieve large-scale random generation of text images. Whether addressing issues such as data scarcity, class imbalance, or enhancing image diversity, WordCanvas provides a simple and efficient solution, offering a solid data foundation for deep learning training.

Technical Documentation

For installation and usage instructions, please refer to WordCanvas Documents.

There you will find detailed information about the project.

Installation

Install via PyPI

Install wordcanvas-docsaid:
```
pip install wordcanvas-docsaid
```

Verify installation:

python -c "import wordcanvas; print(wordcanvas.__version__)"

If you see the version number, the installation is successful.

Install from GitHub

Clone the project from GitHub:

git clone https://github.com/DocsaidLab/WordCanvas.git

Install the wheel package:
```
pip install wheel setuptools
```

Build the whl file:

cd WordCanvas
python setup.py bdist_wheel

Install the whl file:

pip install dist/wordcanvas_docsaid-*-py3-none-any.whl

Quick Start

The hardest part is getting started, so we need a simple beginning.

Start with a String

Simply provide a basic declaration and you're ready to go.

from wordcanvas import WordCanvas

gen = WordCanvas(return_infos=True)

Using all default settings, you can directly call the function to generate a text image.

text = "你好！Hello, World!"
img, infos = gen(text)

print(img.shape)
# >>> (67, 579, 3)

print(infos)
# {'text': '你好！Hello, World!',
#  'bbox(xyxy)': (0, 21, 579, 88),
#  'bbox(wh)': (579, 67),
#  'offset': (0, -21),
#  'direction': 'ltr',
#  'background_color': (0, 0, 0),
#  'text_color': (255, 255, 255),
#  'spacing': 4,
#  'align': 'left',
#  'stroke_width': 0,
#  'stroke_fill': (0, 0, 0),
#  'font_path': 'fonts/NotoSansTC-Regular.otf',
#  'font_size_actual': 64,
#  'font_name': 'NotoSansTC-Regular',
#  'align_mode': <AlignMode.Left: 0>,
#  'output_direction': <OutputDirection.Remain: 0>}

sample1

[!TIP] In default mode, the output image size depends on:

Font size: The default is 64. As the font size increases, the image size will also increase.
Text length: The longer the text, the wider the image will be, with the exact length determined by pillow.
The output information infos contains all drawing parameters, including text, background color, text color, etc.
To output only the image, set return_infos=False, which is the default setting.

Specify a Specific Font

You can specify your preferred font using the font parameter.

from wordcanvas import WordCanvas

# Do not specify return_infos, default is False, which will not return infos
gen = WordCanvas(
    font_path="/path/to/your/font/OcrB-Regular.ttf"
)

text = 'Hello, World!'
img = gen(text)

sample14

If the font does not support the input text, tofu characters will appear.

text = 'Hello, 中文!'
img = gen(text)

sample15

[!TIP]

How to Check if the Font Supports Characters:

I currently don’t have this requirement, so I’ve left a basic method. This is a simple check that only checks one character at a time, so you need to iterate through all the characters. If you have other requirements, feel free to expand on this.
from wordcanvas import is_character_supported, load_ttfont

target_text = 'Hello, 中文!'

font = load_ttfont("/path/to/your/font/OcrB-Regular.ttf")

for c in target_text:
    status = is_character_supported(font, c)

# >>> Character '中' (0x4e2d) is not supported by the font.
# >>> Character '文' (0x6587) is not supported by the font.

Set Image Size

You can adjust the image size using the output_size parameter.

from wordcanvas import WordCanvas

gen = WordCanvas(output_size=(64, 1024)) # Height 64, Width 1024
img = gen(text)
print(img.shape)
# >>> (64, 1024, 3)

sample4

When the specified size is smaller than the text image size, the text image will be automatically scaled.

In other words, the text will be squeezed together, becoming a narrow rectangle, for example:

from wordcanvas import WordCanvas

text = '你好' * 10
gen = WordCanvas(output_size=(64, 512))  # Height 64, Width 512
img = gen(text)

sample8

Adjust Background Color

You can adjust the background color using the background_color parameter.

from wordcanvas import WordCanvas

gen = WordCanvas(background_color=(255, 0, 0)) # Red background
img = gen(text)

sample2

Adjust Text Color

You can adjust the text color using the text_color parameter.

from wordcanvas import WordCanvas

gen = WordCanvas(text_color=(0, 255, 0)) # Green text
img = gen(text)

sample3

Adjust Text Alignment

[!WARNING] Do you remember the image size mentioned earlier?

By default, setting text alignment does not have any effect. When drawing the image, there must be extra space in the text image to see the alignment effect.

You can adjust the text alignment using the align_mode parameter.

from wordcanvas import AlignMode, WordCanvas

gen = WordCanvas(
    output_size=(64, 1024),
    align_mode=AlignMode.Center
)

text = '你好！ Hello, World!'
img = gen(text)

Center alignment: AlignMode.Center
Right alignment: AlignMode.Right
Left alignment: AlignMode.Left
Scatter alignment: AlignMode.Scatter

[!TIP]

In scatter alignment mode, not every character will be spread out, but words will be spread as a unit. In Chinese, the unit of a word is a character; in English, the unit of a word is a space.

For example, the input text "你好！ Hello, World!" will be split into:

["你", "好", "！", "Hello,", "World!"]

Spaces are ignored, and scatter alignment is applied.

Additionally, when the input text can only be split into a single word, scatter alignment for Chinese words is equivalent to center alignment, and English words will be split into individual characters for scatter alignment.

The logic we use for this is:
def split_text(text: str):
    """ Split text into a list of characters. """
    pattern = r"[a-zA-Z0-9\p{P}\p{S}]+|."
    matches = regex.findall(pattern, text)
    matches = [m for m in matches if not regex.match(r'\p{Z}', m)]
    if len(matches) == 1:
        matches = list(text)
    return matches

[!WARNING] This is a very simple implementation and may not meet all requirements. If you have a more complete solution for string splitting, feel free to provide it.

Adjust Text Direction

You can adjust the text direction using the direction parameter.

Output horizontal text

from wordcanvas import AlignMode, WordCanvas

text = '你好！'
gen = WordCanvas(direction='ltr') # Left to right horizontal text
img = gen(text)

sample9

Output vertical text

from wordcanvas import AlignMode, WordCanvas

text = '你好！'
gen = WordCanvas(direction='ttb') # Top to bottom vertical text
img = gen(text)

sample10

Output vertical text with scatter alignment

from wordcanvas import AlignMode, WordCanvas

text = '你好！'
gen = WordCanvas(
    direction='ttb',
    align_mode=AlignMode.Scatter,
    output_size=(64, 512)
)
img = gen(text)

sample11

Adjust Output Direction

You can adjust the output direction using the output_direction parameter.

[!TIP]

When to use this parameter: When you choose "Output vertical text" but still want to view the text image horizontally, you can use this parameter.

Vertical text, horizontal output

from wordcanvas import OutputDirection, WordCanvas

gen = WordCanvas(
    direction='ttb',
    output_direction=OutputDirection.Horizontal
)

text = '你好！'
img = gen(text)

sample12

Horizontal text, vertical output

from wordcanvas import OutputDirection, WordCanvas

gen = WordCanvas(
    direction='ltr',
    output_direction=OutputDirection.Vertical
)

text = '你好！'
img = gen(text)

sample13

Squeeze Text

In some scenarios, the text might need to be specially squeezed. You can use the text_aspect_ratio parameter to adjust this.

from wordcanvas import WordCanvas

gen = WordCanvas(
    text_aspect_ratio=0.25, # Text height / text width = 1/4
    output_size=(32, 1024),
)  # Squeezed text

text="壓扁測試"
img = gen(text)

sample16

[!IMPORTANT] It is important to note that if the squeezed text size exceeds the output_size, the image will go through an automatic scaling process. Therefore, you might end up squeezing the text, but it will be scaled back to its original size, and nothing will appear to have happened.

Text Stroke

You can adjust the width of the text stroke using the stroke_width parameter.

from wordcanvas import WordCanvas

gen = WordCanvas(
    font_size=64,
    text_color=(0, 0, 255), # Red text
    background_color=(255, 0, 0), # Blue background
    stroke_width=2, # Stroke width
    stroke_fill=(0, 255, 0), # Green stroke
)

text="文字外框測試"
img = gen(text)

sample29

[!WARNING] Using stroke_width may result in an OSError: array allocation size too large error with certain text.
Using `stroke_width` may cause an OSError: array allocation size too large error with certain text.
This is a known issue with the `Pillow` library (see https://github.com/python-pillow/Pillow/issues/7287) and cannot be resolved directly.
We found in testing that using stroke_width in Pillow may intermittently result in an OSError. This is a known issue with Pillow, and we have linked the relevant issue in the warning. You can click the link to view it.

Multi-line Text

You can use the \n newline character to create multi-line text.

from wordcanvas import WordCanvas

gen = WordCanvas()

text = '你好！\nHello, World!'
img = gen(text)

sample30

In the case of multi-line text, you can combine it with most of the features mentioned above, for example:

from wordcanvas import WordCanvas, AlignMode

gen = WordCanvas(
  text_color=(0, 0, 255), # Red text
  output_size=(128, 512), # Height 128, Width 512
  background_color=(0, 0, 0), # Black background
  align_mode=AlignMode.Center, # Center alignment
  stroke_width=2, # Stroke width
  stroke_fill=(0, 255, 0), # Green stroke
)

text = '你好！\nHello, World!'
img = gen(text)

sample31

[!WARNING]

The following situations do not support multi-line text:
align_mode does not support AlignMode.Scatter
gen = WordCanvas(align_mode=AlignMode.Scatter)
direction does not support ttb
 gen = WordCanvas(direction='ttb')
If you need these features, please avoid using multi-line text.

Dashboard

The basic functionality is more or less as described.

Finally, let's introduce the dashboard feature.

from wordcanvas import WordCanvas

gen = WordCanvas()
print(gen)

You can also directly output without using print, as we have implemented the __repr__ method.

Once output, you will see a simple dashboard.

dashboard

Here you can see:

The first column is Property, which lists all the setting parameters.
The second column is Current Value, showing the current value of the parameter.
The third column is SetMethod, which shows how the parameter is set.
- Parameters set with set can be directly modified;
- Parameters set with reinit require reinitialization of the WordCanvas object.
The fourth column is DType, which shows the data type of the parameter.
The fifth column is Description, which describes the parameter. (This column is not shown in the image above to save space.)

Most parameters can be directly set, meaning that when you need to modify output characteristics, you don't need to create a new object. Just modify the settings directly. Parameters that require reinit usually involve the initialization of font formats, such as text height (font_size) and others.

gen.output_size = (64, 1024)
gen.text_color = (0, 255, 0)
gen.align_mode = AlignMode.Center
gen.direction = 'ltr'
gen.output_direction = OutputDirection.Horizontal

After setting, you can directly call the function to get the new text image. Additionally, if you modify parameters related to reinit, you will receive the corresponding error:

AttributeError: can't set attribute

gen.font_size = 128
# >>> AttributeError: can't set attribute

[!CAUTION]

Of course, you can still forcefully modify the parameters, but as a fellow Python user, I can't stop you:
gen._font_size = 128
However, this will cause errors when generating the image later!

Don't insist; just reinitialize the object.

Summary

Many features haven't been mentioned, but the basic functionality has been covered.

This concludes the basic usage of the project. For more detailed information and usage methods, please refer directly to WordCanvas Advanced Usage.

Citation

If you find our work helpful, please cite the following:

@misc{yuan2024wordcanvas,
  author = {Ze Yuan},
  title = {WordCanvas},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/DocsaidLab/WordCanvas}}
}

FAQs

What is wordcanvas-docsaid?

Is wordcanvas-docsaid well maintained?

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

wordcanvas-docsaid

WordCanvas

Introduction

Technical Documentation

Installation

Install via PyPI

Install from GitHub

Quick Start

Start with a String

Specify a Specific Font

Set Image Size

Adjust Background Color

Adjust Text Color

Adjust Text Alignment

Adjust Text Direction

Adjust Output Direction

Squeeze Text

Text Stroke

Multi-line Text

Dashboard

Summary

Citation

Related posts

TC39 Advances 3 Proposals to Stage 4: RegExp Escaping, Float16Array, and Redeclarable global eval vars

Deno 2.2 Improves Dependency Management and Expands Node.js Compatibility