
Me and RegEx

 

What is RegEx?

Regex, short for regular expressions, is not a programming language but a notation for describing patterns in text. For translators, one of its main uses is quality checking in translated and other texts and documents. According to Riccardo Schiaffino,

RegEx "is a search-and-replace function on steroids. Regular expressions can assist our translation work by allowing us to search, replace, and filter text in ways that would otherwise be impossible in our software tools." (https://www.ata-chronicle.online/highlights/regular-expressions-an-introduction-for-translators/)

If you are a linguist or have some affinity for languages, you will pick up regex quickly, after some trial and error. :)

For us translators, regex is important because CAT tools use regular expressions for segmentation and auto-translation rules.

Below are my first attempts at creating some basic rules that can be used for Hungarian translations.

RegEx for English to Hungarian Translations

Example 1: Hungarian (or other) names with more than one space between their parts

Regular Expression: a-záéúőóüö.[A-ZÁÉÚŐÓÜÖ]

Explanation: This regex looks for one or more extra spaces between consecutive words that start with capital letters, covering both Hungarian accented letters and common Latin letters. It is designed particularly for checking Hungarian and English proper names that have two or more components. Note that an extra space between regular lowercase words is not picked up.

The regular expression was first checked in regex101.com.


It picked up all the extra spaces between the names, regardless of whether they had two or more elements or a period between them. (I just realized that this regex can also be used to check for an extra space between sentences, that is, after a sentence ending in a period, including when the next sentence starts with a Hungarian letter. This is super helpful and definitely broadens its usage!)

I added the regex in Trados, and with the Verify option it gave me warnings for extra spaces. (Please note that I had some formatting issues with how the test sentences were displayed and placed in segments in Trados, but this is just further confirmation that the regular expression also picks up extra spaces between anything that ends with a lowercase letter or a period and anything that starts with a capital letter.)
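For readers who want to try the Example 1 check outside a CAT tool, here is a minimal Python sketch. The pattern in it is an assumed reading of the rule described above (a lowercase letter or a period, then two or more spaces, then a capital letter), not necessarily the exact expression entered in Trados. The names `EXTRA_SPACE` and `find_extra_spaces` are made up for illustration, and í, ű, Í, Ű are added to the character classes as an assumption so that all Hungarian accented vowels are covered.

```python
import re

# Assumed reading of the Example 1 rule: a lowercase letter or a period,
# then two or more spaces, then a capital letter.
# í/ű and Í/Ű are added here as an assumption to cover all Hungarian vowels.
EXTRA_SPACE = re.compile(r"[a-záéíóöőúüű.] {2,}[A-ZÁÉÍÓÖŐÚÜŰ]")


def find_extra_spaces(text):
    """Return each matched snippet: last character, the spaces, first capital."""
    return [m.group(0) for m in EXTRA_SPACE.finditer(text)]


sample = "Kovács  Ákos met John  Smith. The big  dog barked.  Őszintén szólva"
print(find_extra_spaces(sample))
```

With this sample, the double spaces inside the two names and after the period are reported, while the double space between "big" and "dog" is not, because the next word starts with a lowercase letter.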

Example 2: English and other quotation marks replaced with Hungarian (lower and upper) quotation marks

Regular Expression: ("|'|<|>|‘|“)(.*)("|'|<|>|’|”)

Substitution: „$2”

Explanation: It's common for English upper quotation marks to be left in translated texts simply because translators don't have a direct way to enter the Hungarian ones, but in Hungarian they are considered incorrect. This expression looks for segments that start or end with anything other than the Hungarian lower and upper quotation marks, including ", ', ‘, ’, “, ”, <, and >. The replacement changes them so that they start with a lower quotation mark („) and end with an upper quotation mark (”). Note: French quotation marks were not included because Hungarian uses them, too.


The regular expression was checked in regex101.com. It picked up all the wrong quotation marks and left the Hungarian and French ones alone. The substitution replaced them all with Hungarian quotation marks.

In Trados, I used the Replace option with the regex and the substitution, and again it picked up all the wrong quotation marks. With Replace or Replace All, I could change all of them to the Hungarian ones.
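The same substitution can be sketched in Python for testing outside Trados. This is an illustration under assumptions, not how Trados runs the rule: Python writes the `$2` backreference as `\2`, the names `GREEDY`, `LAZY`, and `to_hungarian_quotes` are hypothetical, and the second pattern with a lazy `.*?` is a variation that is not part of the rule above. It is included because the greedy `.*` stretches from the first opening mark to the last closing mark when a line contains more than one quotation.

```python
import re

# Pattern as given above; Python writes the $2 backreference as \2.
GREEDY = re.compile(r"""("|'|<|>|‘|“)(.*)("|'|<|>|’|”)""")

# Variation (an assumption, not part of the original rule): lazy .*? so that
# each quotation on a line is matched separately.
LAZY = re.compile(r"""("|'|<|>|‘|“)(.*?)("|'|<|>|’|”)""")


def to_hungarian_quotes(segment, pattern=GREEDY):
    """Replace the matched outer marks with the Hungarian „ and ” marks."""
    return pattern.sub(r"„\2”", segment)


print(to_hungarian_quotes('"Jó napot"'))
print(to_hungarian_quotes('"egy" és "kettő"'))
print(to_hungarian_quotes('"egy" és "kettő"', LAZY))
```

With the greedy pattern, the second call treats the whole line as one long quotation and leaves the inner marks untouched; with the lazy variation, each quoted word gets its own pair of Hungarian marks. Because `'` is in both sets, apostrophes can also be matched, so it is worth reviewing the hits before using Replace All.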

Have fun!
