Full Version: Regular Expressions
Dylan Parry
Hi folks,

Does anyone know of a quick way to match [A-Za-z0-9] and *all* letters
with accents, e.g. ü and é?

Thanks,

--
Dylan Parry
http://webpageworkshop.co.uk -- FREE Web tutorials and references

Norman L. DeForest
On Tue, 28 Jun 2005, Dylan Parry wrote:

QUOTE
Hi folks,

Does anyone know of a quick way to match [A-Za-z0-9] and *all* letters
with accents, e.g. ü and é?

Thanks,

What character set? What encoding?

The expressions needed for ISO-8859-1 would be a lot different than the
expressions needed for ISO-8859-10 and a completely different (and
extremely large) expression would be needed for UTF-8.

Finding 'Ó' ('O'-acute in ISO-8859-1) may be easy if the encoding is
plain text but (depending on context and encoding) you may also need to
search for '%D3' or '=D3' or '&Oacute;' or '&#211;' or '&#xD3;' or the
byte pair hex C3,93 (the UTF-8 encoding for D3).
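
To make the escaped forms listed above concrete, here is a minimal sketch that produces or decodes each spelling of 'Ó' with standard-library calls only. Python is my choice for illustration, not something from the thread, and the variable names are made up.

```python
import html
import quopri
from urllib.parse import quote

ch = "\u00d3"  # 'Ó', code point D3 in ISO-8859-1

latin1_bytes = ch.encode("iso-8859-1")        # the single byte D3
utf8_bytes = ch.encode("utf-8")               # the two bytes C3 93
url_form = quote(ch, encoding="iso-8859-1")   # percent-encoding of the Latin-1 byte
qp_form = quopri.encodestring(latin1_bytes).decode("ascii")  # quoted-printable

print(latin1_bytes.hex(), utf8_bytes.hex(), url_form, qp_form)

# The three HTML spellings all decode back to the same character.
for entity in ("&Oacute;", "&#211;", "&#xD3;"):
    assert html.unescape(entity) == ch
```

The practical point is the same one made above: a pattern written for the raw character will not see any of these escaped spellings unless the text is decoded first.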

Would you want to include non-Latin letters such as Cyrillic or Hebrew or
Arabic or Greek or Chinese letters or just accented *Latin* characters?
My Unicode charts also include a number of alternate forms for digits
such as the Arabic digits '۰' to '۹'. Would you want them
included as digits?
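
Whether alternate digits count is often decided by the regex engine's flags rather than by the pattern. As a rough illustration in Python (my choice of language, using U+06F0 as the example digit), the same `\d` behaves differently depending on whether Unicode matching is in effect:

```python
import re
import unicodedata

persian_zero = "\u06f0"  # EXTENDED ARABIC-INDIC DIGIT ZERO

# On str patterns, \d matches any Unicode decimal digit by default.
unicode_match = re.fullmatch(r"\d", persian_zero) is not None
# With re.ASCII, \d is restricted to [0-9].
ascii_match = re.fullmatch(r"\d", persian_zero, flags=re.ASCII) is not None
# An explicit range never depends on flags.
range_match = re.fullmatch(r"[0-9]", persian_zero) is not None

print(unicode_match, ascii_match, range_match)
print(unicodedata.digit(persian_zero))  # numeric value carried by the digit
```

Writing `[0-9]` explicitly is the safer choice when only ASCII digits are wanted.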

This should match all digits and letters in the ISO-8859-1 character set
unless they are escaped somehow ('%xx', '=xx', '&#nn;', etc.):
[0-9A-Za-zÀ-ÖØ-öø-ÿ]
and for CP1252, this would include a few more accented characters that are
control codes in ISO-8859-1:
[0-9A-Za-zŠŒŽšœžŸÀ-ÖØ-öø-ÿ]
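
A class like this can be checked mechanically. The sketch below is in Python and assumes the ISO-8859-1 class is the usual three ranges that skip the multiplication sign (D7) and division sign (F7). It decodes all 256 byte values and compares what the class matches with what Unicode itself calls a letter.

```python
import re

# Assumed form of the ISO-8859-1 class: three ranges skipping D7 and F7.
LATIN1_ALNUM = re.compile("[0-9A-Za-zÀ-ÖØ-öø-ÿ]")

# Decode every byte value as ISO-8859-1 so each byte maps to one character.
all_chars = [bytes([b]).decode("iso-8859-1") for b in range(256)]

matched = {c for c in all_chars if LATIN1_ALNUM.fullmatch(c)}
# Reference set: ASCII digits plus whatever Unicode classifies as a letter.
expected = {c for c in all_chars if c in "0123456789" or c.isalpha()}

extra = sorted(matched - expected)   # matched by the class but not letters/digits
missed = sorted(expected - matched)  # letters the class does not cover

print("extra:", extra)
print("missed:", missed)
```

Run as written, this should report no extras and three misses: ª, µ and º, which Unicode classifies as letters even though they are not accented letters. Whether to add them depends on what "letter" means for the job at hand.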

Depending on the software you are using, you may have to substitute
"\ddd" for each high character, with 'ddd' being the octal value of the
character, or '\xdd' with 'dd' being the hexadecimal value.
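
As a sketch of that substitution, here is the ISO-8859-1 class written three ways in Python: with literal characters, with hexadecimal escapes, and with octal escapes. The ranges are my assumed reading of the class, and other regex engines may spell the escapes differently.

```python
import re

literal = re.compile("[0-9A-Za-zÀ-ÖØ-öø-ÿ]")
hex_form = re.compile(r"[0-9A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]")
octal_form = re.compile(r"[0-9A-Za-z\300-\326\330-\366\370-\377]")

samples = ["a", "Z", "7", "Ó", "ü", "é", "×", "÷", " ", "_"]

# One row per sample: (text, literal result, hex result, octal result).
results = [
    (
        s,
        bool(literal.fullmatch(s)),
        bool(hex_form.fullmatch(s)),
        bool(octal_form.fullmatch(s)),
    )
    for s in samples
]

for row in results:
    print(row)
```

The escaped forms have the advantage of surviving a file being saved or transmitted in the wrong encoding, since the pattern itself stays pure ASCII.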

--
Windows is *not* a "Toy OS".
/me desperately trying to hide the URL for the screenshot of my desktop
http://www.chebucto.ns.ca/~af380/temp/MyDe...Jun-22-2005.gif

John Bokma
Dylan Parry wrote:

QUOTE
Hi folks,

Does anyone know of a quick way to match [A-Za-z0-9] and *all* letters
with accents, e.g. ü and é?

Which language?

--
John Perl SEO tools: http://johnbokma.com/perl/
Experienced (web) developer: http://castleamber.com/
Get a SEO report of your site for just 100 USD:
http://johnbokma.com/websitedesign/seo-expert-help.html

Toby Inkster
Dylan Parry wrote:

QUOTE
Does anyone know of a quick way to match [A-Za-z0-9] and *all* letters
with accents, e.g. ü and é?

In what language and what character set? Perl has quite good built-in
support for Unicode, so if you're using Perl and Unicode, it should be as
simple as "\w". See "man perlunicode" for details.
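
The same idea can be tried out in Python, whose `re` module also treats `\w` as Unicode-aware on text strings. This is only a sketch in a different language from the one suggested above, and the helper names are made up. Note that `\w` also accepts the underscore, so a second pattern excludes it.

```python
import re

WORD = re.compile(r"\w+")            # Unicode-aware for str patterns in Python 3
ALNUM_ONLY = re.compile(r"[^\W_]+")  # same set minus the underscore

def is_word(text: str) -> bool:
    """True if the whole string consists of \\w characters."""
    return WORD.fullmatch(text) is not None

def is_alnum(text: str) -> bool:
    """True if the whole string is letters and digits only (no underscore)."""
    return ALNUM_ONLY.fullmatch(text) is not None

for sample in ["Zürich", "café", "naïve99", "snake_case", "two words"]:
    print(sample, is_word(sample), is_alnum(sample))
```

Bear in mind that this accepts letters and digits from every script, not just accented Latin ones.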

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

Dylan Parry
Using a pointed stick and pebbles, John Bokma scraped:

QUOTE
Does anyone know of a quick way to match [A-Za-z0-9] and *all* letters
with accents, e.g. ü and é?

Which language?

PHP. I've settled, for now, on a string of acceptable foreign
characters, but the result is a bloody ugly looking regexp!
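
One possible alternative to a hand-maintained list, sketched here in Python rather than PHP, is to ask the Unicode database whether each character is a Latin letter. The function names are hypothetical, and the "name starts with LATIN" test is my own heuristic, not an official Unicode script property.

```python
import unicodedata

def is_latin_letter_or_digit(ch: str) -> bool:
    """True for ASCII digits and for letters whose Unicode name starts with LATIN."""
    if ch in "0123456789":
        return True
    if not ch.isalpha():
        return False
    # Heuristic: accented and plain Latin letters are all named "LATIN ...".
    return unicodedata.name(ch, "").startswith("LATIN")

def is_acceptable(text: str) -> bool:
    """True if text is non-empty and made only of Latin letters and ASCII digits."""
    # Normalise first so 'e' + combining acute becomes the single letter 'é'.
    text = unicodedata.normalize("NFC", text)
    return bool(text) and all(is_latin_letter_or_digit(c) for c in text)

for sample in ["Dylan", "Müller", "été", "Жук", "two words", "abc123"]:
    print(sample, is_acceptable(sample))
```

This trades a long character class for a per-character lookup, which is slower but easier to read and does not need updating when a new accented letter turns up.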

--
Dylan Parry
http://webpageworkshop.co.uk -- FREE Web tutorials and references

