meta-tags-parser
Fast and modern meta tags parser (og, twitter, title, description, etc) with snippet support
Readme
Fast, modern, pure Python meta tags parser and snippet creator with full support for type annotations, a py.typed marker in the base package, and structured output. No jelly dicts, only typed structures!
Social media snippets are the link previews (title, description, image, URL) built from a page's Open Graph and Twitter meta tags.
pip install meta-tags-parser
from meta_tags_parser import parse_meta_tags_from_source, structs
desired_result: structs.TagsGroup = parse_meta_tags_from_source("""... html source ...""")
# desired_result is what you want
from meta_tags_parser import parse_tags_from_url, parse_tags_from_url_async, structs
desired_result: structs.TagsGroup = parse_tags_from_url("https://xfenix.ru")
# and the async variant (await must run inside a coroutine)
desired_result: structs.TagsGroup = await parse_tags_from_url_async("https://xfenix.ru")
# desired_result is what you want in both cases
from meta_tags_parser import parse_snippets_from_source, structs
snippet_obj: structs.SnippetGroup = parse_snippets_from_source("""... html source ...""")
# snippet_obj is what you want
# access like snippet_obj.open_graph.title, ...
from meta_tags_parser import parse_snippets_from_url, parse_snippets_from_url_async, structs
snippet_obj: structs.SnippetGroup = parse_snippets_from_url("https://xfenix.ru")
# and the async variant (await must run inside a coroutine)
snippet_obj: structs.SnippetGroup = await parse_snippets_from_url_async("https://xfenix.ru")
# snippet_obj is what you want
# access like snippet_obj.open_graph.title, ...
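Since the async variants are plain coroutines, several pages can be fetched concurrently. The following is a minimal sketch, not part of the library: `fetch_many` and `fake_fetch` are hypothetical names, and `fake_fetch` is a stand-in that performs no network access. In real use you would presumably pass `parse_snippets_from_url_async` as `fetch_one`.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, TypeVar

ResultT = TypeVar("ResultT")


async def fetch_many(
    urls: List[str],
    fetch_one: Callable[[str], Awaitable[ResultT]],
) -> Dict[str, ResultT]:
    """Run fetch_one for every URL concurrently and map each URL to its result."""
    # asyncio.gather preserves the order of its arguments, so zip is safe here.
    results = await asyncio.gather(*(fetch_one(one_url) for one_url in urls))
    return dict(zip(urls, results))


async def fake_fetch(url: str) -> str:
    # Hypothetical stand-in for parse_snippets_from_url_async; no network access.
    await asyncio.sleep(0)
    return f"snippet for {url}"
```

Note that `asyncio.gather` propagates the first exception by default, so one failing URL fails the whole batch unless you handle errors inside `fetch_one`.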
Important note: the *_from_url functions are written only for convenience and are very error-prone, so reconnections and error handling are entirely up to you.
Also, I don't want to add bloated requirements to provide robust connections for every user, because they may simply not expect any of this from the library. But if you really need this, write to me.
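Because retries are left to the caller, one option is a small generic wrapper. This is only a sketch under assumptions: `call_with_retries` is a hypothetical helper, not a library function, and it retries on any `Exception` by default because the exceptions raised by the *_from_url functions are not documented here.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

ResultT = TypeVar("ResultT")


def call_with_retries(
    func: Callable[[], ResultT],
    attempts: int = 3,
    delay_seconds: float = 0.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> ResultT:
    """Call func until it succeeds or the attempts run out, then re-raise."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt_number in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt_number == attempts:
                raise  # out of attempts: surface the last error to the caller
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

A call might look like `call_with_retries(lambda: parse_snippets_from_url("https://xfenix.ru"), attempts=3, delay_seconds=1.0)`, narrowing `retry_on` once you know which errors are worth retrying.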
Let's say you want to extract a Twitter snippet from an HTML page:
from meta_tags_parser import parse_snippets_from_source, structs
my_result: structs.SnippetGroup = parse_snippets_from_source("""
<meta property="og:card" content="summary_large_image">
<meta property="og:url" content="https://github.com/">
<meta property="og:title" content="Hello, my friend">
<meta property="og:description" content="Content here, yehehe">
<meta property="twitter:card" content="summary_large_image">
<meta property="twitter:url" content="https://github.com/">
<meta property="twitter:title" content="Hello, my friend">
<meta property="twitter:description" content="Content here, yehehe">
""")
print(my_result)
# What will be printed:
"""
SnippetGroup(
    open_graph=SocialMediaSnippet(
        title='Hello, my friend',
        description='Content here, yehehe',
        image='',
        url='https://github.com/'
    ),
    twitter=SocialMediaSnippet(
        title='Hello, my friend',
        description='Content here, yehehe',
        image='',
        url='https://github.com/'
    )
)
"""
# You can access attributes like this
my_result.open_graph.title
my_result.twitter.image
# All fields are required and always available, even if they contain no data
# So there is no need to worry about attribute existence (but you may need to check values)
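Since every field exists but may be an empty string, checking values usually means choosing a fallback. Below is a minimal sketch of that idea. The two dataclasses are local stand-ins that mirror the printed output above (they are not imported from the library), and `first_non_empty` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocialMediaSnippet:
    # Local stand-in mirroring the fields printed above; not the library class.
    title: str = ""
    description: str = ""
    image: str = ""
    url: str = ""


@dataclass(frozen=True)
class SnippetGroup:
    # Local stand-in with the same two attributes as the printed SnippetGroup.
    open_graph: SocialMediaSnippet
    twitter: SocialMediaSnippet


def first_non_empty(group: SnippetGroup, field_name: str, default: str = "") -> str:
    """Return the first non-empty value of a field, preferring Open Graph over Twitter."""
    for one_snippet in (group.open_graph, group.twitter):
        value = getattr(one_snippet, field_name)
        if value:
            return value
    return default
```

Preferring Open Graph over Twitter is an arbitrary choice for this sketch; swap the order in the tuple if your use case differs.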
The main function is parse_meta_tags_from_source. It can be used like this:
from meta_tags_parser import parse_meta_tags_from_source, structs
my_result: structs.TagsGroup = parse_meta_tags_from_source("""... html source ...""")
print(my_result)
# What will be printed:
"""
structs.TagsGroup(
    title="...",
    twitter=[
        structs.OneMetaTag(
            name="title", value="Hello",
            ...
        )
    ],
    open_graph=[
        structs.OneMetaTag(
            name="title", value="Hello",
            ...
        )
    ],
    basic=[
        structs.OneMetaTag(
            name="title", value="Hello",
            ...
        )
    ],
    other=[
        structs.OneMetaTag(
            name="article:name", value="Hello",
            ...
        )
    ]
)
"""
As you can see from this example, we are not using any jelly dicts, only structured dataclasses. Let's see another example:
from meta_tags_parser import parse_meta_tags_from_source, structs
my_result: structs.TagsGroup = parse_meta_tags_from_source("""
<meta property="twitter:card" content="summary_large_image">
<meta property="twitter:url" content="https://github.com/">
<meta property="twitter:title" content="Hello, my friend">
<meta property="twitter:description" content="Content here, yehehe">
""")
print(my_result)
# What will be printed:
"""
TagsGroup(
    title='',
    basic=[],
    open_graph=[],
    twitter=[
        OneMetaTag(name='card', value='summary_large_image'),
        OneMetaTag(name='url', value='https://github.com/'),
        OneMetaTag(name='title', value='Hello, my friend'),
        OneMetaTag(name='description', value='Content here, yehehe')
    ],
    other=[]
)
"""
for one_tag in my_result.twitter:
    if one_tag.name == "title":
        print(one_tag.value)
# What will be printed:
"""
Hello, my friend
"""
You can specify what you want to parse:
from meta_tags_parser import parse_meta_tags_from_source, structs
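result: structs.TagsGroup = parse_meta_tags_from_source(
    """... source ...""",
    # WhatToParse is assumed here to live in the structs module
    what_to_parse=(structs.WhatToParse.TITLE, structs.WhatToParse.BASIC, structs.WhatToParse.OPEN_GRAPH, structs.WhatToParse.TWITTER, structs.WhatToParse.OTHER),
)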
If you reduce this tuple of parsing requirements, overall parsing speed may increase.
The library removes the og: and twitter: prefixes from the original attributes and lets the dataclass structures carry this information. If the parser meets a meta tag with the property og:name, it will be available in the my_result variable as one element of the list my_result.open_graph.
The page title (e.g. <title>Something</title>) will be available as the string my_result.title (of course, you receive Something).
Basic meta tags (see BASIC_META_TAGS) will be available as a list in my_result.basic.
Other tags will be available as a list in the my_result.other attribute; tag names will be preserved, unlike the og:/twitter: behaviour.
For structured snippets, use the parse_snippets_from_source function.
You can check the release page at https://github.com/xfenix/meta-tags-parser/releases/.
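To illustrate the prefix handling described in the notes above, here is a rough sketch of grouping tags by prefix. It is not the library's implementation: `group_by_prefix` and `PREFIX_TO_GROUP` are hypothetical names, tags are modelled as plain (name, value) tuples, and the basic group is left out because the contents of BASIC_META_TAGS are not shown here.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed mapping from attribute prefix to result group, for illustration only.
PREFIX_TO_GROUP: Dict[str, str] = {"og:": "open_graph", "twitter:": "twitter"}


def group_by_prefix(
    raw_tags: Iterable[Tuple[str, str]],
) -> Dict[str, List[Tuple[str, str]]]:
    """Strip known prefixes into their groups; keep other names untouched."""
    grouped: Dict[str, List[Tuple[str, str]]] = {
        "open_graph": [],
        "twitter": [],
        "other": [],
    }
    for name, value in raw_tags:
        for prefix, group_name in PREFIX_TO_GROUP.items():
            if name.startswith(prefix):
                # The group itself records the origin, so the prefix is dropped.
                grouped[group_name].append((name[len(prefix):], value))
                break
        else:
            # No known prefix matched: the full name is preserved.
            grouped["other"].append((name, value))
    return grouped
```

With this sketch, a tag named og:name lands in the open_graph group as name, while article:name stays in other with its full name.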
We found that meta-tags-parser demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.