mirror of https://github.com/webrecorder/pywb.git synced 2025-03-14 15:53:28 +01:00

Localization Support (#647)

* add localization utilities:
- add locmanager to support extract, update, remove, list using pybabel
- add po2csv/csv2po conversion with translate-utils
- docs: add localization.rst to manual!

* add language switch header (via header.html) to all pages if more than one locale is present.

* localization: wrap more text strings in templates in existing templates

* docs:
- document `wb-manager i18n` commands
- mention `<html lang>` setting
- include csv example
- add info about adding localizable text in templates

* add localization to CHANGES
Ilya Kreymer 2021-06-09 13:12:53 -07:00 committed by GitHub
parent 0eedd1502f
commit 12fcc87962
16 changed files with 341 additions and 48 deletions


@ -7,6 +7,17 @@ Documentation Updates:
* `New ACL header configuration <https://pywb.readthedocs.io/en/latest/manual/usage.html#config-acl-header>`_
* `Localization / Multi-lingual Support Guide <https://pywb.readthedocs.io/en/latest/manual/localization.html>`_
Localization Improvements: (`#647 <https://github.com/webrecorder/pywb/pull/647>`_)
* Support for extracting, updating, listing and removing locales and their localizable strings via the ``wb-manager i18n`` command.
* UI: Add language switch header to all UI templates.
* Mark localizable strings as translatable in existing templates.
Access Control Improvements:

babel.ini (new file)

@ -0,0 +1,2 @@
[jinja2: pywb/templates/**.html]
extensions=jinja2.ext.i18n,jinja2.ext.autoescape,jinja2.ext.with_


@ -17,3 +17,6 @@ enable_memento: true
# Replay content in an iframe
framed_replay: true
locales:
- en
- es


@ -18,6 +18,7 @@ A subset of features provides the basic functionality of a "Wayback Machine".
manual/configuring
manual/access-control
manual/ui-customization
manual/localization
manual/architecture
manual/apis
manual/owb-transition


@ -0,0 +1,147 @@
.. _localizaation:
Localization / Multi-lingual Support
------------------------------------
pywb supports configuring multiple locales, loading a translation for each one, and switching between languages dynamically.
pywb can extract all text from templates and generate CSV files for translation and convert them back into a binary format used for localization/internationalization.
(pywb uses the `Babel library <http://babel.pocoo.org/en/latest/>`_ which extends the `standard Python i18n system <https://docs.python.org/3/library/gettext.html>`_)
Locales to use are configured in the ``config.yaml``.
The command-line ``wb-manager`` utility provides a way to manage locales for translation, including extracting the text to translate and updating the translated text.
Adding a Locale and Extracting Text
===================================
To add a new locale for translation and automatically extract all text that needs to be translated, run::

    wb-manager i18n extract <loc>

The ``<loc>`` can be one or more supported two-letter locales or CLDR language codes. To list available codes, you can run ``pybabel --list-locales``.
Localization data is placed in the ``i18n`` directory, and translatable strings can be found in ``i18n/translations/<locale>/LC_MESSAGES/messages.csv``.
Each CSV file looks as follows, listing each source string alongside an empty ``target`` column for the translated version::

    "location","source","target"
    "pywb/templates/banner.html:6","Live on",""
    "pywb/templates/banner.html:8","Calendar icon",""
    "pywb/templates/banner.html:9 pywb/templates/query.html:45","View All Captures",""
    "pywb/templates/banner.html:10 pywb/templates/header.html:4","Language:",""
    "pywb/templates/banner.html:11","Loading...",""
    ...

This CSV can then be passed to translators to translate the text.
(The extraction parameters are configured in ``babel.ini`` to load text from ``pywb/templates/**.html``.)
For example, the following will generate translation strings for ``es`` and ``pt`` locales::

    wb-manager i18n extract es pt

The translatable text can then be found in ``i18n/translations/es/LC_MESSAGES/messages.csv`` and ``i18n/translations/pt/LC_MESSAGES/messages.csv``.
The CSV files should be updated with a translation for each string in the target column.
The extract command adds any new strings without overwriting existing translations, so it is safe to run multiple times.
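As a rough illustration of working with this CSV layout, the sketch below lists the rows whose ``target`` column is still empty. The helper ``untranslated_rows`` is hypothetical and not part of pywb; it assumes only the three-column header (``location``, ``source``, ``target``) shown above, and the sample translation is made up for the example.

```python
import csv
import io


def untranslated_rows(csv_text):
    """Return (location, source) pairs whose target column is still empty.

    Hypothetical helper, not part of pywb. Assumes the header row
    "location","source","target" shown in the example above.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["location"], row["source"])
        for row in reader
        if not (row.get("target") or "").strip()
    ]


# Illustrative sample data only; the Spanish string is not taken from pywb.
sample = '''"location","source","target"
"pywb/templates/banner.html:6","Live on","En vivo"
"pywb/templates/banner.html:11","Loading...",""
'''

for location, source in untranslated_rows(sample):
    print(location, "->", source)
```

A check like this could be run before ``wb-manager i18n update`` to see which strings a translator has not yet filled in.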
Updating Locale Catalog
=======================
Once the text has been translated, and the CSV files updated, simply run::

    wb-manager i18n update <loc>

This will parse the CSVs and compile the translated string tables for use with pywb.
Specifying locales in pywb
==========================
To enable locales in pywb, add one or more of them to the ``locales`` key in ``config.yaml``, for example::

    locales:
      - en
      - es

Single Language Default Locale
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pywb can be configured with a default, single-language locale, by setting the ``default_locale`` property in ``config.yaml``::

    default_locale: es
    locales:
      - es

With this configuration, pywb will automatically use the ``es`` locale for all text strings in pywb pages.
pywb will also set ``<html lang="es">`` so that the browser recognizes the correct locale.
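The fallback behaviour can be pictured with a small sketch. It loosely mirrors the logic visible in ``init_loc`` in this commit, where ``default_locale`` falls back to ``en`` when unset, but the function ``effective_locale`` and the plain-dict config are assumptions made for illustration and are not pywb code.

```python
def effective_locale(config, requested=None):
    """Pick the locale to use for a request.

    Hypothetical helper: `config` stands in for the parsed config.yaml,
    `requested` for a locale taken from the request, if any.
    """
    # As in init_loc, an unset default_locale falls back to 'en'
    default_locale = config.get("default_locale") or "en"
    locales = config.get("locales") or []

    # Only honour a requested locale that is actually configured
    if requested in locales:
        return requested
    return default_locale


single = {"default_locale": "es", "locales": ["es"]}
print(effective_locale(single))
```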
Multi-language Translations
~~~~~~~~~~~~~~~~~~~~~~~~~~~
If more than one locale is specified, pywb will automatically show a language switching UI at the top of collection and search pages, with an option
for each locale listed. To include English as an option, it should also be added as a locale (and no strings translated). For example::

    locales:
      - en
      - es
      - pt

will configure pywb to show a language switch option on all pages.
Localized Collection Paths
==========================
When localization is enabled, pywb supports a locale prefix in the URL for accessing each collection in a specific language.
If pywb has a collection ``my-web-archive``, then:
* ``/my-web-archive/`` - loads UI with default language (set via ``default_locale``)
* ``/en/my-web-archive/`` - loads UI with ``en`` locale
* ``/es/my-web-archive/`` - loads UI with ``es`` locale
* ``/pt/my-web-archive/`` - loads UI with ``pt`` locale
The language switch options work by changing the locale prefix for the same page.
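The prefix scheme above can be sketched as follows. Both helpers (``split_locale_prefix`` and ``with_locale_prefix``) are hypothetical and only illustrate the URL layout described here; they are not pywb's routing code.

```python
def split_locale_prefix(path, locales, default_locale="en"):
    """Split a leading locale segment off a request path.

    Returns (locale, remaining_path). Hypothetical helper for illustration.
    """
    segments = path.lstrip("/").split("/", 1)
    if segments[0] in locales:
        rest = segments[1] if len(segments) > 1 else ""
        return segments[0], "/" + rest
    # No recognized prefix: the default locale applies
    return default_locale, path


def with_locale_prefix(path, new_locale, locales):
    """Rewrite a path so it points at the same page under another locale."""
    _, rest = split_locale_prefix(path, locales)
    return "/" + new_locale + rest


locales = ["en", "es", "pt"]
print(split_locale_prefix("/es/my-web-archive/", locales))
print(with_locale_prefix("/es/my-web-archive/", "pt", locales))
```

Keeping the locale in the path means a language switch link is just the current URL with a different first segment.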
Listing and Removing Locales
============================
To list the locales that have previously been added, run ``wb-manager i18n list``.
To disable a locale from being used in pywb, simply remove it from the ``locales`` key in ``config.yaml``.
To remove data for a locale permanently, you can run: ``wb-manager i18n remove <loc>``. This will remove the locale directory on disk.
To remove all localization data, you can manually delete the ``i18n`` directory.
UI Templates: Adding Localizable Text
=====================================
Localizable text (text that can be translated) can be marked as such directly in the UI templates:

1. By wrapping the text in ``{% trans %}``/``{% endtrans %}`` tags. For example::

      {% trans %}Collection {{ coll }} Search Page{% endtrans %}

2. By calling the shorthand ``_()`` function, which can be used in attributes or in more dynamic contexts. For example::

      ... title="{{ _('Enter a URL to search for') }}">

These methods can be used in all UI templates and are supported by the Jinja2 templating system.
See :ref:`ui-customizations` for a list of all available UI templates.


@ -5,3 +5,4 @@ uwsgi
ujson
pysocks
lxml
translate_toolkit


@ -72,7 +72,8 @@ class RewriterApp(object):
self.jinja_env.init_loc(self.config.get('locales_root_dir'),
self.config.get('locales'),
self.loc_map)
self.loc_map,
self.config.get('default_locale'))
self.redirect_to_exact = config.get('redirect_to_exact')

pywb/manager/locmanager.py (new file)

@ -0,0 +1,109 @@
import os
import os.path
import shutil

from babel.messages.frontend import CommandLineInterface

from translate.convert.po2csv import main as po2csv
from translate.convert.csv2po import main as csv2po


ROOT_DIR = 'i18n'

TRANSLATIONS = os.path.join(ROOT_DIR, 'translations')

MESSAGES = os.path.join(ROOT_DIR, 'messages.pot')


# ============================================================================
class LocManager:
    def process(self, r):
        # Dispatch to the handler registered for the chosen subcommand
        if r.name == 'list':
            r.loc_func(self)
        elif r.name == 'remove':
            r.loc_func(self, r.locale)
        else:
            r.loc_func(self, r.locale, r.no_csv)

    def extract_loc(self, locale, no_csv):
        self.extract_text()

        for loc in locale:
            loc_dir = os.path.join(TRANSLATIONS, loc)
            # Update an existing catalog, otherwise create a new one
            if os.path.isdir(loc_dir):
                self.update_catalog(loc)
            else:
                os.makedirs(loc_dir)
                self.init_catalog(loc)

            if not no_csv:
                base = os.path.join(TRANSLATIONS, loc, 'LC_MESSAGES')
                po = os.path.join(base, 'messages.po')
                csv = os.path.join(base, 'messages.csv')
                po2csv([po, csv])

    def update_loc(self, locale, no_csv):
        for loc in locale:
            if not no_csv:
                base = os.path.join(TRANSLATIONS, loc, 'LC_MESSAGES')
                po = os.path.join(base, 'messages.po')
                csv = os.path.join(base, 'messages.csv')

                # Only convert back if a translated CSV is present
                if os.path.isfile(csv):
                    csv2po([csv, po])

        self.compile_catalog()

    def remove_loc(self, locale):
        for loc in locale:
            loc_dir = os.path.join(TRANSLATIONS, loc)
            if not os.path.isdir(loc_dir):
                print('Locale "{0}" does not exist'.format(loc))
                return

            shutil.rmtree(loc_dir)
            print('Removed locale "{0}"'.format(loc))

    def list_loc(self):
        print('Current locales:')
        print('\n'.join(' - ' + x for x in os.listdir(TRANSLATIONS)))
        print('')

    def extract_text(self):
        os.makedirs(ROOT_DIR, exist_ok=True)
        CommandLineInterface().run(['pybabel', 'extract', '-F', 'babel.ini',
                                    '-k', '_ _Q gettext ngettext',
                                    '-o', MESSAGES, './', '--omit-header'])

    def init_catalog(self, loc):
        CommandLineInterface().run(['pybabel', 'init', '-l', loc, '-i', MESSAGES, '-d', TRANSLATIONS])

    def update_catalog(self, loc):
        CommandLineInterface().run(['pybabel', 'update', '-l', loc, '-i', MESSAGES, '-d', TRANSLATIONS, '--previous'])

    def compile_catalog(self):
        CommandLineInterface().run(['pybabel', 'compile', '-d', TRANSLATIONS])

    @classmethod
    def init_parser(cls, parser):
        """Initializes an argument parser for i18n commands

        :param argparse.ArgumentParser parser: The parser to be initialized
        :rtype: None
        """
        subparsers = parser.add_subparsers(dest='op')
        subparsers.required = True

        def command(name, func):
            op = subparsers.add_parser(name)
            if name != 'list':
                op.add_argument('locale', nargs='+')

            if name != 'remove':
                op.add_argument('--no-csv', action='store_true')

            op.set_defaults(loc_func=func, name=name)

        command('extract', cls.extract_loc)
        command('update', cls.update_loc)
        command('remove', cls.remove_loc)
        command('list', cls.list_loc)
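To see how the subcommands above parse their arguments, here is a stand-alone sketch using only ``argparse``. The function ``build_i18n_parser`` is hypothetical; the real parser is attached to the ``wb-manager`` subparsers and dispatches to ``LocManager`` methods, which this sketch leaves out.

```python
import argparse


def build_i18n_parser():
    """Build a stand-alone parser shaped like the i18n subcommands.

    Illustrative only: it omits the loc_func dispatch used by LocManager.
    """
    parser = argparse.ArgumentParser(prog="i18n")
    subparsers = parser.add_subparsers(dest="op")
    subparsers.required = True

    for name in ("extract", "update", "remove", "list"):
        op = subparsers.add_parser(name)
        if name != "list":
            # One or more locale codes, e.g. "es pt"
            op.add_argument("locale", nargs="+")

        if name != "remove":
            op.add_argument("--no-csv", action="store_true")

        op.set_defaults(name=name)

    return parser


args = build_i18n_parser().parse_args(["extract", "es", "pt", "--no-csv"])
print(args.name, args.locale, args.no_csv)
```

Because ``locale`` uses ``nargs='+'``, every subcommand except ``list`` requires at least one locale code.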


@ -441,6 +441,17 @@ Create manage file based web archive collections
ACLManager.init_parser(acl)
acl.set_defaults(func=do_acl)
# LOC
from pywb.manager.locmanager import LocManager
def do_loc(r):
loc = LocManager()
loc.process(r)
loc_help = 'Generate strings for i18n/localization'
loc = subparsers.add_parser('i18n', help=loc_help)
LocManager.init_parser(loc)
loc.set_defaults(func=do_loc)
# Parse
r = parser.parse_args(args=args)
r.func(r)


@ -98,6 +98,8 @@ class JinjaEnv(object):
assets_env.resolver = PkgResResolver()
jinja_env.assets_environment = assets_env
self.default_locale = ''
def _make_loaders(self, paths, packages):
"""Initialize the template loaders based on the supplied paths and packages.
@ -117,16 +119,19 @@ class JinjaEnv(object):
return loaders
def init_loc(self, locales_root_dir, locales, loc_map):
def init_loc(self, locales_root_dir, locales, loc_map, default_locale):
locales = locales or []
locales_root_dir = locales_root_dir or os.path.join('i18n', 'translations')
default_locale = default_locale or 'en'
self.default_locale = default_locale
if locales_root_dir:
for loc in locales:
loc_map[loc] = Translations.load(locales_root_dir, [loc, 'en'])
loc_map[loc] = Translations.load(locales_root_dir, [loc, default_locale])
#jinja_env.jinja_env.install_gettext_translations(translations)
def get_translate(context):
loc = context.get('env', {}).get('pywb_lang')
loc = context.get('env', {}).get('pywb_lang', default_locale)
return loc_map.get(loc)
def override_func(jinja_env, name):
@ -160,6 +165,7 @@ class JinjaEnv(object):
self.jinja_env.globals['locales'] = list(loc_map.keys())
self.jinja_env.globals['_Q'] = quote_gettext
self.jinja_env.globals['default_locale'] = default_locale
@contextfunction
def switch_locale(context, locale):


@ -182,7 +182,7 @@ This file is part of pywb, https://github.com/webrecorder/pywb
ancillaryLinks.appendChild(calendarLink);
this.calendarLink = calendarLink;
if (typeof window.banner_info.locales !== "undefined" && window.banner_info.locales.length) {
if (typeof window.banner_info.locales !== "undefined" && window.banner_info.locales.length > 1) {
var locales = window.banner_info.locales;
var languages = document.createElement("div");
@ -317,4 +317,4 @@ This file is part of pywb, https://github.com/webrecorder/pywb
}
}
})();
})();


@ -9,6 +9,7 @@
<!-- jquery and bootstrap dependencies query view -->
<link rel="stylesheet" href="{{ static_prefix }}/css/bootstrap.min.css"/>
<link rel="stylesheet" href="{{ static_prefix }}/css/font-awesome.min.css">
<link rel="stylesheet" href="{{ static_prefix }}/css/base.css">
<script src="{{ static_prefix }}/js/jquery-latest.min.js"></script>
<script src="{{ static_prefix }}/js/bootstrap.min.js"></script>


@ -1,5 +1,5 @@
{% extends "base.html" %}
{% block title %}Pywb Error{% endblock %}
{% block title %}{{ _('Pywb Error') }}{% endblock %}
{% block body %}
<div class="container text-danger">
<div class="row justify-content-center">
@ -8,22 +8,22 @@
<div class="row">
<div class="col-12 text-center">
{% if err_status == 451 %}
<p class="lead">Access Blocked to {{ err_msg }}</p>
<p class="lead">{% trans %}Access Blocked to {{ err_msg }}{% endtrans %}</p>
{% elif err_status == 404 and err_details == 'coll_not_found' %}
<p>Collection not found: <b>{{ err_msg }}</b></p>
<p>{% trans %}Collection not found: <b>{{ err_msg }}</b>{% endtrans %}</p>
<p><a href="/">See list of valid collections</a></p>
<p><a href="/">{{ _('See list of valid collections') }}</a></p>
{% elif err_status == 404 and err_details == 'static_file_not_found' %}
<p>Static file not found: <b>{{ err_msg }}</b></p>
<p>{% trans %}Static file not found: <b>{{ err_msg }}</b>{% endtrans %}</p>
{% else %}
<p class="lead">{{ err_msg }}</p>
{% if err_details %}
<p class="lead">Error Details:</p>
<p class="lead">{% trans %}Error Details:{% endtrans %}</p>
<pre>{{ err_details }}</pre>
{% endif %}
{% endif %}


@ -3,7 +3,7 @@
<div class="container">
<div class="row">
<h2 class="display-2">{{ _('Pywb Wayback Machine') }}</h2>
<p class="lead">This archive contains the following collections:</p>
<p class="lead">{{ _('This archive contains the following collections:') }}</p>
</div>
<div class="row">
<ul>


@ -1,6 +1,6 @@
{% extends "base.html" %}
{% block title %}URL Not Found{% endblock %}
{% block title %}{{ _('URL Not Found') }}{% endblock %}
{% block body %}
<div class="container">
@ -13,7 +13,7 @@
{% if wbrequest and wbrequest.env.pywb_proxy_magic and url %}
<p>
<a href="//select.{{ wbrequest and wbrequest.env.pywb_proxy_magic }}/{{ url }}">
Try Different Collection
{{ _('Try Different Collection') }}
</a>
</p>
{% endif %}


@ -13,7 +13,7 @@ window.wb_prefix = "{{ wb_prefix }}";
<div class="container-fluid">
<div class="row justify-content-center">
<h4 class="display-4">
Collection {{ coll }} Search Page
{% trans %}Collection {{ coll }} Search Page{% endtrans %}
</h4>
</div>
</div>
@ -27,9 +27,9 @@ window.wb_prefix = "{{ wb_prefix }}";
</label>
<input aria-label="url" aria-required="true" class="form-control form-control-lg" id="search-url"
name="search" placeholder="Enter a URL to search for"
title="Enter a URL to search for" type="search" required/>
title="{{ _('Enter a URL to search for') }}" type="search" required/>
<div class="invalid-feedback">
Please enter a URL
{% trans %}Please enter a URL{% endtrans %}
</div>
</div>
</div>
@ -37,7 +37,7 @@ window.wb_prefix = "{{ wb_prefix }}";
<div class="col-5">
<div class="custom-control custom-checkbox custom-control">
<input type="checkbox" class="custom-control-input" id="open-results-new-window">
<label class="custom-control-label" for="open-results-new-window">Open results in new window</label>
<label class="custom-control-label" for="open-results-new-window">{{ _('Open results in new window') }}</label>
</div>
</div>
<div class="col-7">
@ -47,51 +47,51 @@ window.wb_prefix = "{{ wb_prefix }}";
<button class="btn btn-outline-info float-right mr-3" type="button" role="button"
data-toggle="collapse" data-target="#advancedOptions"
aria-expanded="false" aria-controls="advancedOptions" aria-label="Advanced Search Options">
Advanced Search Options
{{ _('Advanced Search Options') }}
</button>
</div>
</div>
<div class="collapse mt-3" id="advancedOptions">
<div class="form-group form-row">
<label for="match-type-select" class="col-sm-2 col-form-label" aria-label="Match Type">
Match Type:
{{ _('Match Type:') }}
</label>
<select id="match-type-select" class="form-control form-control col-sm-6">
<option value=""></option>
<option value="prefix">Prefix</option>
<option value="host">Host</option>
<option value="domain">Domain</option>
<option value="prefix">{% trans %}Prefix{% endtrans %}</option>
<option value="host">{% trans %}Host{% endtrans %}</option>
<option value="domain">{% trans %}Domain{% endtrans %}</option>
</select>
</div>
<p style="cursor: help;">
<span data-toggle="tooltip" data-placement="right"
title="Restricts the results to the given date/time range (inclusive)">
Date/Time Range
{{ _('Date/Time Range') }}
</span>
</p>
<div class="form-row">
<div class="col-6">
<label class="sr-only" for="dt-from" aria-label="Date/Time Range From">From:</label>
<label class="sr-only" for="dt-from" aria-label="Date/Time Range From">{% trans %}From:{% endtrans %}</label>
<div class="input-group">
<div class="input-group-prepend">
<div class="input-group-text">From:</div>
<div class="input-group-text">{% trans %}From:{% endtrans %}</div>
</div>
<input id="dt-from" type="number" name="date-range-from" class="form-control"
pattern="^\d{4,14}$">
<div class="invalid-feedback" id="dt-from-bad">
Please enter a valid <b>From</b> timestamp. Timestamps may be 4 <= ts <=14 digits
{% trans %}Please enter a valid <b>From</b> timestamp. Timestamps may be 4 <= ts <=14 digits{% endtrans %}
</div>
</div>
</div>
<div class="col-6">
<label class="sr-only" for="dt-to" aria-label="Date/Time Range To">To:</label>
<label class="sr-only" for="dt-to" aria-label="Date/Time Range To">{% trans %}To:{% endtrans %}</label>
<div class="input-group">
<div class="input-group-prepend">
<div class="input-group-text">To:</div>
<div class="input-group-text">{% trans %}To:{% endtrans %}</div>
</div>
<input id="dt-to" type="number" name="date-range-to" class="form-control" pattern="^\d{4,14}$">
<div class="invalid-feedback" id="dt-to-bad">
Please enter a valid <b>To</b> timestamp. Timestamps may be 4 <= ts <=14 digits
{% trans %}Please enter a valid <b>To</b> timestamp. Timestamps may be 4 <= ts <=14 digits{% endtrans %}
</div>
</div>
</div>
@ -99,41 +99,41 @@ window.wb_prefix = "{{ wb_prefix }}";
<div class="form-group mt-3">
<div class="form-row">
<div class="col-6">
<p>Filtering</p>
<p>{% trans %}Filtering{% endtrans %}</p>
</div>
<div class="col-6">
<button id="clear-filters" class="btn btn-outline-warning float-right" type="button">
Clear Filters
{% trans %}Clear Filters{% endtrans %}
</button>
<button id="add-filter" class="btn btn-outline-secondary float-right mr-2" type="button">
Add Filter
{% trans %}Add Filter{% endtrans %}
</button>
</div>
</div>
<div class="form-row">
<div class="col-6">
<div class="row pb-1">
<label for="filter-by" class="col-form-label col-3">By:</label>
<label for="filter-by" class="col-form-label col-3">{% trans %}By:{% endtrans %}</label>
<select id="filter-by" class="form-control col-7">
<option value="" selected></option>
<option value="mime">Mime Type</option>
<option value="status">Status</option>
<option value="url">URL</option>
<option value="mime">{% trans %}Mime Type{% endtrans %}</option>
<option value="status">{% trans %}Status{% endtrans %}</option>
<option value="url">{% trans %}URL{% endtrans %}</option>
</select>
</div>
<div class="row pb-1">
<label for="filter-modifier" class="col-form-label col-3">How:</label>
<label for="filter-modifier" class="col-form-label col-3">{% trans %}How:{% endtrans %}</label>
<select id="filter-modifier" class="form-control col-7">
<option value="=">Contains</option>
<option value="==">Matches Exactly</option>
<option value="=~">Matches Regex</option>
<option value="=!">Does Not Contains</option>
<option value="=!=">Is Not</option>
<option value="=!~">Does Not Begins With</option>
<option value="=">{% trans %}Contains{% endtrans %}</option>
<option value="==">{% trans %}Matches Exactly{% endtrans %}</option>
<option value="=~">{% trans %}Matches Regex{% endtrans %}</option>
<option value="=!">{% trans %}Does Not Contain{% endtrans %}</option>
<option value="=!=">{% trans %}Is Not{% endtrans %}</option>
<option value="=!~">{% trans %}Does Not Begin With{% endtrans %}</option>
</select>
</div>
<div class="row">
<label for="filter-expression" class="col-form-label col-3">Expr:</label>
<label for="filter-expression" class="col-form-label col-3">{% trans %}Expr:{% endtrans %}</label>
<input type="text" id="filter-expression" class="form-control col-7"
placeholder="Enter an expression to filter by"
>
@ -141,7 +141,7 @@ window.wb_prefix = "{{ wb_prefix }}";
</div>
<div class="col-6">
<ul id="filter-list" class="filter-list">
<li id="filtering-nothing">No Filter</li>
<li id="filtering-nothing">{% trans %}No Filter{% endtrans %}</li>
</ul>
</div>
</div>
@ -151,7 +151,7 @@ window.wb_prefix = "{{ wb_prefix }}";
</div>
{% if metadata %}
<div class="container mt-4 justify-content-center">
<p class="lead">Collection Metadata</p>
<p class="lead">{{ _('Collection Metadata') }}</p>
<div class="row">
<div class="col-4 pr-1">
<div class="list-group" id="collection-metadata" role="tablist">