Richie's Blog: May 2009

Friday, May 08, 2009

Tomcat Distilled - Tips and Skills

# Resolve Fatal Error: java.lang.OutOfMemoryError: PermGen space
JAVA_OPTS="-server -XX:PermSize=256M -XX:MaxPermSize=512M"

Thursday, May 07, 2009

MySQL Distilled - A practical introduction

1. MySQL dump
mysqldump -u root -p arrayweb > arrayweb.sql
2. MySQL dump without data (-d is short for --no-data, so only the schema is dumped)
mysqldump -u root -p -d arrayweb > arrayweb.sql

Monday, May 04, 2009

The Regular Expression Introduction

Summary of regular-expression constructs

Construct	Matches

Characters
x	The character x
`\\`	The backslash character
`\0`n	The character with octal value `0`n (0 `<=` n `<=` 7)
`\0`nn	The character with octal value `0`nn (0 `<=` n `<=` 7)
`\0`mnn	The character with octal value `0`mnn (0 `<=` m `<=` 3, 0 `<=` n `<=` 7)
`\x`hh	The character with hexadecimal value `0x`hh
`\u`hhhh	The character with hexadecimal value `0x`hhhh
`\t`	The tab character (`'\u0009'`)
`\n`	The newline (line feed) character (`'\u000A'`)
`\r`	The carriage-return character (`'\u000D'`)
`\f`	The form-feed character (`'\u000C'`)
`\a`	The alert (bell) character (`'\u0007'`)
`\e`	The escape character (`'\u001B'`)
`\c`x	The control character corresponding to x

Character classes
`[abc]`	`a`, `b`, or `c` (simple class)
`[^abc]`	Any character except `a`, `b`, or `c` (negation)
`[a-zA-Z]`	`a` through `z` or `A` through `Z`, inclusive (range)
`[a-d[m-p]]`	`a` through `d`, or `m` through `p`: `[a-dm-p]` (union)
`[a-z&&[def]]`	`d`, `e`, or `f` (intersection)
`[a-z&&[^bc]]`	`a` through `z`, except for `b` and `c`: `[ad-z]` (subtraction)
`[a-z&&[^m-p]]`	`a` through `z`, and not `m` through `p`: `[a-lq-z]` (subtraction)
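
The union, intersection, and subtraction rows above can be tried out with a minimal sketch like the following. The class name `CharClassDemo` and the helper `matches` are illustrative names, not part of the original post.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Returns true when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Union: a through d, or m through p
        System.out.println(matches("[a-d[m-p]]", "n"));    // true
        // Intersection: only d, e, or f
        System.out.println(matches("[a-z&&[def]]", "e"));  // true
        System.out.println(matches("[a-z&&[def]]", "a"));  // false
        // Subtraction: a through z, but not m through p
        System.out.println(matches("[a-z&&[^m-p]]", "n")); // false
        System.out.println(matches("[a-z&&[^m-p]]", "q")); // true
    }
}
```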

Predefined character classes
`.`	Any character (may or may not match line terminators)
`\d`	A digit: `[0-9]`
`\D`	A non-digit: `[^0-9]`
`\s`	A whitespace character: `[ \t\n\x0B\f\r]`
`\S`	A non-whitespace character: `[^\s]`
`\w`	A word character: `[a-zA-Z_0-9]`
`\W`	A non-word character: `[^\w]`

POSIX character classes (US-ASCII only)
`\p{Lower}`	A lower-case alphabetic character: `[a-z]`
`\p{Upper}`	An upper-case alphabetic character: `[A-Z]`
`\p{ASCII}`	All ASCII: `[\x00-\x7F]`
`\p{Alpha}`	An alphabetic character: `[\p{Lower}\p{Upper}]`
`\p{Digit}`	A decimal digit: `[0-9]`
`\p{Alnum}`	An alphanumeric character: `[\p{Alpha}\p{Digit}]`
`\p{Punct}`	Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
`\p{Graph}`	A visible character: `[\p{Alnum}\p{Punct}]`
`\p{Print}`	A printable character: `[\p{Graph}\x20]`
`\p{Blank}`	A space or a tab: `[ \t]`
`\p{Cntrl}`	A control character: `[\x00-\x1F\x7F]`
`\p{XDigit}`	A hexadecimal digit: `[0-9a-fA-F]`
`\p{Space}`	A whitespace character: `[ \t\n\x0B\f\r]`

java.lang.Character classes (simple java character type)
`\p{javaLowerCase}`	Equivalent to java.lang.Character.isLowerCase()
`\p{javaUpperCase}`	Equivalent to java.lang.Character.isUpperCase()
`\p{javaWhitespace}`	Equivalent to java.lang.Character.isWhitespace()
`\p{javaMirrored}`	Equivalent to java.lang.Character.isMirrored()

Classes for Unicode blocks and categories
`\p{InGreek}`	A character in the Greek block (simple block)
`\p{Lu}`	An uppercase letter (simple category)
`\p{Sc}`	A currency symbol
`\P{InGreek}`	Any character except one in the Greek block (negation)
`[\p{L}&&[^\p{Lu}]]`	Any letter except an uppercase letter (subtraction)
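
As a rough illustration of the Unicode block and category classes above, the sketch below checks a few single characters. The class name `UnicodeClassDemo` and the helper `matches` are assumptions made for this example.

```java
import java.util.regex.Pattern;

class UnicodeClassDemo {
    // Returns true when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // "\u03b1" is the Greek small letter alpha
        System.out.println(matches("\\p{InGreek}", "\u03b1")); // true
        System.out.println(matches("\\P{InGreek}", "a"));      // true
        // Uppercase letter category
        System.out.println(matches("\\p{Lu}", "A"));           // true
        System.out.println(matches("\\p{Lu}", "a"));           // false
        // Currency symbols: dollar sign and euro sign
        System.out.println(matches("\\p{Sc}", "$"));           // true
        System.out.println(matches("\\p{Sc}", "\u20ac"));      // true
        // Any letter except an uppercase letter
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "a")); // true
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "A")); // false
    }
}
```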

Boundary matchers
`^`	The beginning of a line
`$`	The end of a line
`\b`	A word boundary
`\B`	A non-word boundary
`\A`	The beginning of the input
`\G`	The end of the previous match
`\Z`	The end of the input but for the final terminator, if any
`\z`	The end of the input
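
The boundary matchers above are easiest to compare by counting matches. This is a small sketch under that assumption; `BoundaryDemo` and `count` are hypothetical names chosen for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Counts how many times the regex is found in the input (flags as in Pattern.compile).
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "The dog plays with the doggie.";
        System.out.println(count("\\bdog\\b", 0, text)); // 1: only the standalone word
        System.out.println(count("\\bdog\\B", 0, text)); // 1: only the "dog" inside "doggie"

        String lines = "one\ntwo\nthree";
        System.out.println(count("^\\w+", 0, lines));                 // 1: ^ matches at input start only
        System.out.println(count("^\\w+", Pattern.MULTILINE, lines)); // 3: ^ matches at every line start

        // \G anchors each match to the end of the previous one
        System.out.println(count("\\Gdog", 0, "dogdog dog")); // 2

        // \Z tolerates a final line terminator, \z does not
        System.out.println(count("end\\Z", 0, "the end\n")); // 1
        System.out.println(count("end\\z", 0, "the end\n")); // 0
    }
}
```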

Greedy quantifiers
X`?`	X, once or not at all
X`*`	X, zero or more times
X`+`	X, one or more times
X`{`n`}`	X, exactly n times
X`{`n`,}`	X, at least n times
X`{`n`,`m`}`	X, at least n but not more than m times

Reluctant quantifiers
X`??`	X, once or not at all
X`*?`	X, zero or more times
X`+?`	X, one or more times
X`{`n`}?`	X, exactly n times
X`{`n`,}?`	X, at least n times
X`{`n`,`m`}?`	X, at least n but not more than m times

Possessive quantifiers
X`?+`	X, once or not at all
X`*+`	X, zero or more times
X`++`	X, one or more times
X`{`n`}+`	X, exactly n times
X`{`n`,}+`	X, at least n times
X`{`n`,`m`}+`	X, at least n but not more than m times
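
The three quantifier families above share the same descriptions but backtrack differently. A minimal sketch of the difference, using the same `xfooxxxxxxfoo` input as the demo code in this post (`QuantifierDemo` and `firstMatch` are illustrative names):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns the first match of the regex in the input, or null when there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: takes as much as possible, then backs off to let "foo" match
        System.out.println(firstMatch(".*foo", input));  // xfooxxxxxxfoo
        // Reluctant: takes as little as possible
        System.out.println(firstMatch(".*?foo", input)); // xfoo
        // Possessive: takes everything and never gives any back, so "foo" cannot match
        System.out.println(firstMatch(".*+foo", input)); // null
    }
}
```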

Logical operators
XY	X followed by Y
X`|`Y	Either X or Y
`(`X`)`	X, as a capturing group

Back references
`\`n	Whatever the nth capturing group matched
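
A back reference matches the same text that the group captured, not the group's pattern again. The following sketch shows this; `BackReferenceDemo`, `firstMatch`, and the doubled-word example are assumptions added for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackReferenceDemo {
    // Returns the first match of the regex in the input, or null when there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // \1 must repeat exactly what group 1 captured
        System.out.println(firstMatch("(\\d\\d)\\1", "1212")); // 1212
        System.out.println(firstMatch("(\\d\\d)\\1", "1213")); // null
        // Finding an accidentally doubled word
        System.out.println(firstMatch("\\b(\\w+)\\s+\\1\\b", "this is is a test")); // is is
    }
}
```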

Quotation
`\`	Nothing, but quotes the following character
`\Q`	Nothing, but quotes all characters until `\E`
`\E`	Nothing, but ends quoting started by `\Q`
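
Quoting matters whenever literal text contains regex metacharacters. A short sketch, assuming a version string such as `1.5` as the literal (`QuotationDemo` is an illustrative name); `Pattern.quote` is the standard library method that wraps a string so it is matched literally.

```java
import java.util.regex.Pattern;

class QuotationDemo {
    // Returns true when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Unquoted, the dot matches any character
        System.out.println(matches("1.5", "1x5"));                // true
        // \Q...\E turns the dot into a literal dot
        System.out.println(matches("\\Q1.5\\E", "1x5"));          // false
        System.out.println(matches("\\Q1.5\\E", "1.5"));          // true
        // Pattern.quote produces an equivalent literal pattern
        System.out.println(matches(Pattern.quote("1.5"), "1.5")); // true
        System.out.println(matches(Pattern.quote("1.5"), "1x5")); // false
    }
}
```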

Special constructs (non-capturing)
`(?:`X`)`	X, as a non-capturing group
`(?idmsux-idmsux)`	Nothing, but turns match flags i d m s u x on - off
`(?idmsux-idmsux:`X`)`	X, as a non-capturing group with the given flags i d m s u x on - off
`(?=`X`)`	X, via zero-width positive lookahead
`(?!`X`)`	X, via zero-width negative lookahead
`(?<=`X`)`	X, via zero-width positive lookbehind
`(?<!`X`)`	X, via zero-width negative lookbehind
`(?>`X`)`	X, as an independent, non-capturing group
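
Lookahead and lookbehind check the surrounding text without including it in the match. The sketch below illustrates all four forms; the class `LookaroundDemo`, the helper `findAll`, and the sample sentences are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Positive lookahead: numbers followed by " dollars"
        System.out.println(findAll("\\d+(?= dollars)", "10 dollars, 20 euros, 30 dollars")); // [10, 30]
        // Negative lookahead: "Java" not followed by "Script"
        System.out.println(findAll("Java(?!Script)", "Java and JavaScript")); // [Java]
        // Positive lookbehind: numbers preceded by a dollar sign
        System.out.println(findAll("(?<=\\$)\\d+", "cost $15, qty 3")); // [15]
        // Negative lookbehind: whole numbers not preceded by a dollar sign
        System.out.println(findAll("(?<!\\$)\\b\\d+", "cost $15, qty 3")); // [3]
    }
}
```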

Match and regex modes
Pattern.UNIX_LINES - (?d)	Changes how dot, ^, and $ match (only \n is treated as a line terminator)
Pattern.DOTALL - (?s)	Causes dot to match any character
Pattern.MULTILINE - (?m)	Expands where ^ and $ can match
Pattern.COMMENTS - (?x)	Free-spacing and comment mode (Applies even inside character classes)
Pattern.CASE_INSENSITIVE - (?i)	Case-insensitive matching for ASCII characters
Pattern.UNICODE_CASE - (?u)	Extends case-insensitive matching to non-ASCII characters (takes effect together with CASE_INSENSITIVE)
Pattern.CANON_EQ	Unicode "canonical equivalence" match mode (different encodings of the same character match as identical)
Pattern.LITERAL	Treat the regex argument as plain, literal text instead of as a regular expression
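
The modes above can be set either as flags to `Pattern.compile` or inline in the regex. A minimal sketch comparing a few of them (`FlagsDemo`, `count`, and `fullMatch` are illustrative names, not from the original post):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsDemo {
    // Counts how many times the regex is found in the input.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Returns true when the whole input matches the regex under the given flags.
    static boolean fullMatch(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Case-insensitive matching, as a flag and as the inline (?i) form
        System.out.println(count("dog", 0, "DoGDOg"));                        // 0
        System.out.println(count("dog", Pattern.CASE_INSENSITIVE, "DoGDOg")); // 2
        System.out.println(count("(?i)dog", 0, "DoGDOg"));                    // 2

        // DOTALL lets the dot match a line terminator
        System.out.println(fullMatch("a.b", 0, "a\nb"));     // false
        System.out.println(fullMatch("(?s)a.b", 0, "a\nb")); // true

        // LITERAL treats the whole regex as plain text
        System.out.println(fullMatch("a.b", Pattern.LITERAL, "axb")); // false
        System.out.println(fullMatch("a.b", Pattern.LITERAL, "a.b")); // true

        // COMMENTS ignores whitespace and allows # comments inside the regex
        System.out.println(fullMatch("(?x) \\d+  # one or more digits", 0, "123")); // true
    }
}
```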


Sunday, May 03, 2009

The Regular Expression Introduction

Usage demos and examples



package sa.cdc.svn.service.repos;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;


public class RegularExpression {
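 /* Simple Regex Test. Each pair below overwrites the previous one, so only the
  * last regex/input pair is actually run; move the pair you want to try to the end. */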
 public void simpleRegexTest() {
  String regex = "\\d+\\w+";
  String input = "This is my 1st test string, soon will the 2nd come.";

  // match like [groups]
  regex = "\\[([^\\[]*)\\]";
  input = "[groups][aliases][authzPath]";

  // match number except 3,4,5
  regex = "[0-9&&[^345]]";
  input = "6";

  regex = "a{3,6}";
  input = "aaaaaaaaa";

  regex = "(dog){3}";
  input = "dogdogdogdogdog";

  regex = "[abc]{3}";
  input = "abccabaaaccbbbc";

  // Reluctant quantifiers
  regex = ".*?foo";
  input = "xfooxxxxxxfoo";

  // Refer to group index
  regex = "(\\d\\d)\\1";
  input = "1212";

  // Start with dog
  regex = "^dog\\w*";
  input = "dogblahblah";

  // A word boundary
  regex = "\\bdog\\b";
  input = "The dog plays in the yard.";

  // A non-word boundary
  regex = "\\bdog\\B";
  input = "The doggie plays in the yard.";

  // The end of the previous match
  regex = "\\Gdog";
  input = "dogdog dog";

  // Need to set Pattern.CASE_INSENSITIVE;
  regex = "dog";
  input = "DoGDOg";

  // (?i) means case insensitive
  regex = "(?i)dog";
  input = "DoGDOg";

  regex = "foo";
  input = "fooooooooooooooooo";

  regex = "a*b";
  input = "aabfooaabfooabfoob";

  // match email address
  regex = "\\w+([-+.]\\w+)*@\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*";
  input = "as_bc@sie.com";

  // match a url
  regex = "^[a-zA-Z]+://(\\w+(-\\w+)*)(\\.(\\w+(-\\w+)*))*(\\?\\S*)?$";
  input = "http://abc.doe?";

  // match a word made up only of digits and the 26 letters
  regex = "^[A-Za-z0-9]+$"; // "^\\w+$" would also allow underscores
  input = "123abc3sdf323";

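  // match a Chinese ID card number (15 or 18 digits); the longer alternative
  // comes first so that find() does not stop after 15 digits
  regex = "\\d{18}|\\d{15}";
  input = "44010484646354875834";

  // match a Chinese local phone number
  regex = "\\d{3}-\\d{8}|\\d{4}-\\d{7}";
  input = "0319-8473645";

  // match an IPv4 address (loosely; the octets are not range-checked)
  regex = "\\d+\\.\\d+\\.\\d+\\.\\d+";
  input = "61.144.43.235";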

  // match an integer (the group keeps ^ and $ applied to both alternatives)
  regex = "^(-?[1-9]\\d*|0)$";
  input = "0";

  // match a paired tag such as <a></a>, or a self-closing tag
  regex = "<(\\S*?)[^>]*>.*?</\\1>|<.*?/>";
  input = "<abc>delphi</abc>";

  // match whitespace before or after a line
  regex = "^\\s*|\\s*$";
  input = "<abc>delphi</abc> ";

  // match a QQ number
  regex = "[1-9][0-9]{4,}";
  input = "8646354";

  // match a date (yy-MM-dd or yyyy-MM-dd; months 01-12, days 01-31)
  regex = "^(\\d{2}|\\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])$";
  input = "89-02-12";

  // match a Chinese character
  regex = "[\u4e00-\u9fa5]";
  input = "志气";

  // match a character outside Latin-1 (code point above U+00FF)
  // JavaScript equivalent: String.prototype.len=function(){return this.replace(/[^\x00-\xff]/g,"aa").length;}
  regex = "[^\\x00-\\xff]";
  input = "志气";

  // match empty line
  regex = "\\n\\s*\\r";
  input = "\n\r";

  // match a float
  regex = "^(-?\\d+)(\\.\\d+)?$";
  input = "-123.23";

  // match a date (yy-MM-dd or yyyy-MM-dd; months 01-12, days 01-31)
  regex = "^(\\d{2}|\\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])$";
  input = "1989-02-12";

  Pattern pattern = Pattern.compile(regex);
  Matcher matcher = pattern.matcher(input);
  boolean found = false;
  while (matcher.find()) {
   System.out.println("Found the text \"" + matcher.group() + "\", start at "
     + matcher.start() + ", end at " + matcher.end());
   found = true;
  }
  if (!found) {
   System.out.println("No match found.");
  }
 }

 /* Parse A Structured File/Log */
 public void parseAuthzFile() {
  try {
   InputStream stream = getClass().getResourceAsStream("authz");
   BufferedReader reader = new BufferedReader(new InputStreamReader(stream));

   StringBuilder authz = new StringBuilder();
   String line = null;
   while ((line = reader.readLine()) != null) {
    authz.append(line);
    authz.append('\n');
   }

   // begins with [ and ends with ]
   String regex = "^\\[([^\\[]*)\\]$";
   String input = authz.toString();

   Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
   Matcher matcher = pattern.matcher(input);

   int location = 0;
   boolean found = false;
   // add global comments of the authz file
   if (matcher.find()) {
    System.out.println(authz.substring(location, matcher.start()));
    location = matcher.start();
    found = true;
   }
   // add each segment
   String segment = null;
   while (matcher.find()) {
    segment = authz.substring(location, matcher.start());
    location = matcher.start();
    System.out.print(segment);
    System.out.println("segment:" + matcher.group(1));
   }
   // then last segment
   if (found) {
    segment = authz.substring(location);
    System.out.print(segment);
   }
  } catch (IOException e) {
   e.printStackTrace();
  }
 }

 public void splitInput() {
  Pattern pattern = Pattern.compile("\\d");
  String input = "one9two4three7four1five";
  String[] items = pattern.split(input);
  for (String item : items) {
   System.out.println(item);
  }
 }

 public void identifyURL() {
  String url = "https://regex.info:8080/blog/article.do?id=123";
  String regex = "(?x) ^(https?):// ([^/:]+) (:(\\d+))? (.*)";
  Matcher m = Pattern.compile(regex).matcher(url);

  if (m.matches()) {
   System.out.print("Overall  [" + m.group() + "]" + " (from " + m.start() + " to "
     + m.end() + ")\n" + "Protocol [" + m.group(1) + "]" + " (from " + m.start(1)
     + " to " + m.end(1) + ")\n" + "Hostname [" + m.group(2) + "]" + " (from "
     + m.start(2) + " to " + m.end(2) + ")\n");
   // Group #3 might not have participated, so we must be careful here
   if (m.group(3) == null)
    System.out.println("No port; default of '80' is assumed");
   else {
    System.out.print("Port is [" + m.group(4) + "] " + "(from " + m.start(4) + " to "
      + m.end(4) + ")\n");
   }
   // Group #5 always participates because (.*) can match the empty string,
   // so test for an empty path rather than for null
   if (m.group(5).length() == 0) {
    System.out.println("No path specified");
   } else {
    System.out.println("Path is [" + m.group(5) + "] " + "(from " + m.start(5) + " to "
      + m.end(5) + ")\n");
   }
  }
 }

 public void searchAndReplace() {
  String regex = "\\bJava\\s*1\\.5\\b";
  String input = "Before Java 1.5 was Java 1.4.2. After Java 1.5 is Java 1.6";
  Matcher matcher = Pattern.compile(regex).matcher(input);

  String result = matcher.replaceAll("Java 5.0");
  System.out.println("Replace all: " + result);

  matcher.reset();
  result = matcher.replaceFirst("Java 5.0");
  System.out.println("Replace first: " + result);

  matcher.reset();
  // You can convert "Java 1.6" to "Java 6.0" as well.
  result = Pattern.compile("\\bJava\\s*1\\.([56])\\b").matcher(input).replaceAll("Java $1.0");
  // "$1\\2" means group 1 followed by a literal 2
  // "$12" means group 12 when the pattern has at least 12 groups; otherwise group 1 followed by a literal 2
  System.out.println("Argument replace: " + result);

  matcher.reset();
  // Use weird replacement text correctly
  result = matcher.replaceAll(Matcher.quoteReplacement("Java \\. $2 5.0"));
  System.out.println("Quote replacement: " + result);

  matcher.reset();
  StringBuffer sb = new StringBuffer();
  while (matcher.find()) {
   matcher.appendReplacement(sb, "Java 5.0");
   System.out.println("Append replacement: " + sb.toString());
  }
  matcher.appendTail(sb);
  System.out.println("Append replacement: " + sb.toString());

  // Convert Celsius temperatures to Fahrenheit
  input = "from 36.3C to 40.1C.";
  // ?: means non-capturing group, here the group count is actually 1
  matcher = Pattern.compile("(\\d+(?:\\.\\d*)?)C\\b").matcher(input);
  sb = new StringBuffer();
  while (matcher.find()) {
   float celsius = Float.parseFloat(matcher.group(1));
   int fahrenheit = (int) (celsius * 9 / 5 + 32);
   matcher.appendReplacement(sb, fahrenheit + "F");
  }
  matcher.appendTail(sb);
  System.out.println("Customized replacement: " + sb.toString());

  // In-Place Replacement
  StringBuilder text = new StringBuilder("It's SO VERY RUDE to shout!");
  matcher = Pattern.compile("\\b[\\p{Lu}\\p{Lt}]+\\b").matcher(text);
  int matchPointer = 0;
  while (matcher.find(matchPointer)) {
   matchPointer = matcher.end();
   text.replace(matcher.start(), matcher.end(), "<b>" + matcher.group().toLowerCase()
     + "</b>");
   matchPointer += 7; // Account for having added '<b>' and '</b>'
  }
  System.out.println("In-place replacement1: " + text);

  matcher.reset();
  sb = new StringBuffer();
  while (matcher.find()) {
   matcher.appendReplacement(sb, "<b>" + matcher.group().toLowerCase() + "</b>");
  }
  matcher.appendTail(sb);
  System.out.println("In-place replacement2: " + sb.toString());

  // Transparent bounds
  regex = "\\bcar\\b";
  input = "Madagascar is best seen by car or bike.";
  matcher = Pattern.compile(regex).matcher(input);
  matcher.useAnchoringBounds(false);
  matcher.useTransparentBounds(true); // try to set false to see difference
  matcher.region(7, input.length());
  matcher.find();
  System.out.println("Matches starting at character " + matcher.start());

  // The matcher's region
  // Matcher to find an image tag in html content
  String html = "a fragment of html text";
  // Matcher to find an image tag. The 'html' variable contains the HTML in question
  // Group 1 captures the body of the tag, which mImg.start(1)/end(1) rely on below
  Matcher mImg = Pattern.compile("(?id)<IMG\\s+(.*?)/?>").matcher(html);
  // Matcher to find an ALT attribute (to be applied to an IMG tag's body within the same
  // 'html' variable)
  Matcher mAlt = Pattern.compile("(?ix)\\b ALT \\s* =").matcher(html);
  // Matcher to find a newline
  Matcher mLine = Pattern.compile("\\n").matcher(html);

  // For each image tag within the html ...
  while (mImg.find()) {
   // Restrict the next ALT search to the body of the just-found image tag
   mAlt.region(mImg.start(1), mImg.end(1));
   // Report an error if no ALT found, showing the whole image tag found above
   if (!mAlt.find()) {
    // Restrict counting of newlines to the text before the start of the image tag
    mLine.region(0, mImg.start());
    int lineNum = 1; // The first line is numbered 1
    while (mLine.find())
     lineNum++; // Each newline bumps up the line number
    System.out.println("Missing ALT attribute on line " + lineNum);
   } else {
    System.out.println("Found ALT attribute, start at " + mAlt.start() + ", end at "
      + mAlt.end());
   }
  }

 }

 public static void main(String[] args) {
  RegularExpression regex = new RegularExpression();
  regex.simpleRegexTest();
  // regex.parseAuthzFile();
  // regex.splitInput();
  // regex.identifyURL();
  regex.searchAndReplace();
 }
}
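
The region example above runs against a placeholder string, so it never actually finds an image tag. Below is a minimal, self-contained sketch of the same technique with sample input. The class name `MissingAltFinder`, the sample HTML, and the exact IMG pattern are assumptions made for illustration, not the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MissingAltFinder {
    // Returns the 1-based line numbers of image tags that have no ALT attribute.
    static List<Integer> linesMissingAlt(String html) {
        // Assumed IMG pattern: group 1 captures the tag body.
        // (?i) = case-insensitive, (?d) = Unix lines mode.
        Matcher mImg = Pattern.compile("(?id)<IMG\\s+(.*?)/?>").matcher(html);
        // (?x) permits free whitespace inside the pattern.
        Matcher mAlt = Pattern.compile("(?ix)\\b ALT \\s* =").matcher(html);
        Matcher mLine = Pattern.compile("\\n").matcher(html);

        List<Integer> result = new ArrayList<>();
        while (mImg.find()) {
            // Restrict the ALT search to the body of the tag just found.
            mAlt.region(mImg.start(1), mImg.end(1));
            if (!mAlt.find()) {
                // Count newlines only in the text before the tag.
                mLine.region(0, mImg.start());
                int lineNum = 1;
                while (mLine.find()) {
                    lineNum++;
                }
                result.add(lineNum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<p>intro</p>\n"
                + "<img src=\"a.png\" alt=\"first\">\n"
                + "<img src=\"b.png\">\n"
                + "<IMG SRC=\"c.png\" />\n";
        System.out.println(linesMissingAlt(html)); // prints [3, 4]
    }
}
```

All three matchers share the same input string; only their regions differ, which is why the offsets from `mImg` can be passed directly to `mAlt.region` and `mLine.region`.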

Computer Technology English Collection 1

1. It provides powerful and innovative functionality with an uncluttered (if somewhat simplistic) API. ("somewhat": a little, to some degree)
2. I understand that some readers interested only in Java may be inclined to start their reading with this chapter. ("be inclined to": tend to)
3. Specifying this flag may impose a slight performance penalty. ("impose a performance penalty": hurt performance)
4. To remedy that deficiency, here is a version of replaceAll that does respect the region. ("remedy a deficiency": make up for a shortcoming)
5. We can actually work with text that we can modify directly, in place, and on the fly. ("in place", "on the fly": on the spot, immediately, while running)
6. We do this by using the start- and end- method data from the just-completed image-tag match to set the ALT-matcher's region, prior to invoking the ALT-matcher's find.

Saturday, May 02, 2009

Using Regular Expression to Parse the authz File

The Original authz file

### This file is an example authorization file for svnserve.
### Its format is identical to that of mod_authz_svn authorization
### files.
### As shown below each section defines authorizations for the path and
### (optional) repository specified by the section name.
### The authorizations follow. An authorization line can refer to:
### - a single user,
### - a group of users defined in a special [groups] section,
### - an alias defined in a special [aliases] section,
### - all authenticated users, using the '$authenticated' token,
### - only anonymous users, using the '$anonymous' token,
### - anyone, using the '*' wildcard.
###
### A match can be inverted by prefixing the rule with '~'. Rules can
### grant read ('r') access, read-write ('rw') access, or no access
### ('').

[aliases]
# joe = /C=XZ/ST=Dessert/L=Snake City/O=Snake Oil, Ltd./OU=Research Institute/CN=Joe Average

[groups]
# harry_and_sally = harry,sally
# harry_sally_and_joe = harry,sally,&joe

# [/foo/bar]
# harry = rw
# &joe = r
# * =

# [repository:/baz/fuz]
# @harry_and_sally = rw
# * = r

[groups]
Sales = wlin,jshi,zbai
Quality = wqiu,she,wzhang,juguo,lchen,chunmchen

[/]
* = rw
# This account will be used for the svn web client
svnwebclient = r

[/Sales]
@Sales = rw
* = r

[/Quality]
@Quality = rw
* = r

# This is the Sales' private folder
[/Sales/Private]
@Sales = rw
* =

# This is the Quality's private folder
[/Quality/Private]
@Quality = rw
* =

Using Regular Expression to Parse the authz File

    public Authz() {
        try {
            InputStream stream = getClass().getResourceAsStream("authz");
            BufferedReader reader = new BufferedReader(new InputStreamReader(stream));

            StringBuilder authz = new StringBuilder();
            String line = null;
            while ((line = reader.readLine()) != null) {
                authz.append(line);
                authz.append(Constant.ENTER_SIGN);
            }

            // begins with [ and ends with ]
            String regex = "^\\[([^\\[]*)\\]$";
            String input = authz.toString();

            Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
            Matcher matcher = pattern.matcher(input);

            int location = 0;
            boolean found = false;
            // add global comments of the authz file
            if (matcher.find()) {
                System.out.println(authz.substring(location, matcher.start()));
                location = matcher.start();
                found = true;
            }
            // add each segment
            String segment = null;
            while (matcher.find()) {
                segment = authz.substring(location, matcher.start());
                location = matcher.start();
                System.out.print(segment);
                System.out.println("segment:" + matcher.group(1));
            }
            // then last segment
            if (found) {
                segment = authz.substring(location);
                System.out.print(segment);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

Below is the output:

### This file is an example authorization file for svnserve.
### Its format is identical to that of mod_authz_svn authorization
### files.
### As shown below each section defines authorizations for the path and
### (optional) repository specified by the section name.
### The authorizations follow. An authorization line can refer to:
### - a single user,
### - a group of users defined in a special [groups] section,
### - an alias defined in a special [aliases] section,
### - all authenticated users, using the '$authenticated' token,
### - only anonymous users, using the '$anonymous' token,
### - anyone, using the '*' wildcard.
###
### A match can be inverted by prefixing the rule with '~'. Rules can
### grant read ('r') access, read-write ('rw') access, or no access
### ('').

[aliases]
# joe = /C=XZ/ST=Dessert/L=Snake City/O=Snake Oil, Ltd./OU=Research Institute/CN=Joe Average

segment:groups
[groups]
# harry_and_sally = harry,sally
# harry_sally_and_joe = harry,sally,&joe

# [/foo/bar]
# harry = rw
# &joe = r
# * =

# [repository:/baz/fuz]
# @harry_and_sally = rw
# * = r

segment:groups
[groups]
Sales = wlin,jshi,zbai
Quality = wqiu,she,wzhang,juguo,lchen,chunmchen

segment:/
[/]
* = rw
# This account will be used for the svn web client
svnwebclient = r

segment:/Sales
[/Sales]
@Sales = rw
* = r

segment:/Quality
[/Quality]
@Quality = rw
* = r

# This is the Sales' private folder
segment:/Sales/Private
[/Sales/Private]
@Sales = rw
* =

# This is the Quality's private folder
segment:/Quality/Private
[/Quality/Private]
@Quality = rw
* =
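
The constructor above only prints the segments. As a rough sketch of how the same header pattern could be used to collect the sections into a map instead, consider the following. The class name `AuthzSections`, the `split` method, and the short sample input are hypothetical and not part of the original program.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AuthzSections {
    // Same header pattern as above: a line that begins with [ and ends with ]
    private static final Pattern HEADER =
            Pattern.compile("^\\[([^\\[]*)\\]$", Pattern.MULTILINE);

    // Maps each section name to the text between its header and the next header.
    // Text before the first header (global comments) is ignored here.
    // A repeated section name, such as a second [groups], has its text appended.
    static Map<String, String> split(String authz) {
        Map<String, String> sections = new LinkedHashMap<>();
        Matcher m = HEADER.matcher(authz);
        String name = null;
        int bodyStart = 0;
        while (m.find()) {
            if (name != null) {
                sections.merge(name, authz.substring(bodyStart, m.start()), String::concat);
            }
            name = m.group(1);
            bodyStart = m.end();
        }
        if (name != null) {
            sections.merge(name, authz.substring(bodyStart), String::concat);
        }
        return sections;
    }

    public static void main(String[] args) {
        String authz = "# global comment\n"
                + "[groups]\n"
                + "Sales = wlin,jshi\n"
                + "\n"
                + "[/Sales]\n"
                + "@Sales = rw\n"
                + "* = r\n";
        for (Map.Entry<String, String> e : split(authz).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().trim().replace("\n", "; "));
        }
        // prints:
        // groups -> Sales = wlin,jshi
        // /Sales -> @Sales = rw; * = r
    }
}
```

A `LinkedHashMap` keeps the sections in file order, which matters if the result is later written back out.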

Using JSP/Servlet/Filter to Implement Login Authorization

login.jsp

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<%@ include file="/common/taglibs.jsp"%>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<%@ include file="/common/meta.jsp"%>
<title>Login - <m:message key="webapp.title" /></title>
<link rel="stylesheet" type="text/css" href="<c:url value='/styles/login.css'/>" />
<script type="text/javascript" src="<c:url value='/scripts/login.js'/>"></script>
</head>
<body>
<%@ include file="/common/header.jsp"%>
<div id="login">
<%@ include file="/common/status.jsp"%>
<div id="loginContent">
<form action="<c:url value='/login.do'></c:url>" method="post" onsubmit="return check()">
<div><input type="hidden" name="uri" value="${uri}" /></div>
<div><input type="hidden" name="repos" value="${repos}" /></div>
<div><m:message key="login.username" /><input type="text" class="commonInput" name="username" value="${username}" /></div>
<div><m:message key="login.password" /><input type="password" class="passwordInput" name="password" value="${password}" /></div>
<div id="submit"><input type="submit" class="buttonInput" value="<m:message key="button.login" />" /></div>
</form>
</div>
</div>
<%@ include file="/common/footer.jsp"%>
</body>
</html>

LoginServlet.java

package sa.cdc.svn.web.security;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import sa.cdc.svn.Constant;
import sa.cdc.svn.model.security.Certificate;
import sa.cdc.svn.service.server.ServerManager;
import sa.cdc.svn.web.BaseServlet;

public class LoginServlet extends BaseServlet {
    private static final long serialVersionUID = 8871588625655468698L;

    @Override
    protected void service(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String loginPage = "/WEB-INF/pages/login.jsp";
        Certificate certificate = null;

        // redirect to this page after login
        String uri = request.getParameter("uri");
        uri = (uri == null) ? "/home.do" : uri;

        request.setAttribute("uriClick", uri);

        String repos = request.getParameter("repos");
        String username = request.getParameter("username");
        String password = request.getParameter("password");

        String ip = request.getRemoteAddr();

        // first try to use current certificate
        certificate = (Certificate) request.getSession().getAttribute(Constant.SESSION_CERTIFICATE);
        if (certificate == null) {
            if ((repos == null || repos.length() == 0) && isReposOperation(uri)) {
                // repository can't be empty
                request.setAttribute("uri", uri);
                request.setAttribute(Constant.STATUS, Constant.STATUS_ERROR);
                request.setAttribute(Constant.MESSAGE, "Please select a repository to login.");
                request.getRequestDispatcher("/home.do").forward(request, response);
            } else if (username == null || password == null) {
                // username or password can't be null
                request.setAttribute("uri", uri);
                request.setAttribute("repos", repos);
                request.getRequestDispatcher(loginPage).forward(request, response);
            } else if (username.length() == 0 || password.length() == 0) {
                // username or password can't be empty
                request.setAttribute("uri", uri);
                request.setAttribute("repos", repos);
                request.setAttribute(Constant.STATUS, Constant.STATUS_ERROR);
                request.setAttribute(Constant.MESSAGE, "Please input username and password.");
                request.getRequestDispatcher(loginPage).forward(request, response);
            } else {
                // go into authentication
                ServerManager serverManager = new ServerManager(repos);
                certificate = serverManager.validateUser(username, password);
                if (certificate != null) {
                    // succeeded to login
                    request.getSession().setAttribute(Constant.SESSION_CERTIFICATE, certificate);
                    request.setAttribute(Constant.STATUS, Constant.STATUS_OK);
                    response.sendRedirect(request.getContextPath() + uri);
                    logInfo(request, "Login from [" + ip + "]");
                } else {
                    // failed to login
                    request.setAttribute("uri", uri);
                    request.setAttribute("repos", repos);
                    request.setAttribute(Constant.STATUS, Constant.STATUS_WARNING);
                    request.setAttribute(Constant.MESSAGE, "Username or password is invalid.");
                    request.getRequestDispatcher(loginPage).forward(request, response);
                }
            }
        } else {
            ServerManager serverManager = new ServerManager(repos);
            username = certificate.getUsername();
            password = certificate.getPassword();
            certificate = serverManager.validateUser(username, password);
            if (certificate != null) {
                // succeeded to login
                request.getSession().setAttribute(Constant.SESSION_CERTIFICATE, certificate);
                //logInfo(request, "login from [" + ip + "]");
                request.setAttribute(Constant.STATUS, Constant.STATUS_OK);
                response.sendRedirect(request.getContextPath() + uri);
            } else {
                // failed to login
                // remove the current invalid certificate
                request.getSession().removeAttribute(Constant.SESSION_CERTIFICATE);
                request.setAttribute("uri", uri);
                request.setAttribute("repos", repos);
                request.setAttribute(Constant.STATUS, Constant.STATUS_WARNING);
                request.setAttribute(Constant.MESSAGE, "Username or password is invalid.");
                request.getRequestDispatcher(loginPage).forward(request, response);
            }
        }
    }

    private boolean isReposOperation(String uri) {
        return (uri.equals("/authz.do") || uri.equals("/group.do") || uri.equals("/alias.do") || uri
                .equals("/account.do"));
    }

}

LogoutServlet.java

package sa.cdc.svn.web.security;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import sa.cdc.svn.web.BaseServlet;

public class LogoutServlet extends BaseServlet {
    private static final long serialVersionUID = 9069229823236250773L;

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String ip = request.getRemoteAddr();
        logInfo(request, "Logout from [" + ip + "]");
        request.getSession().invalidate();
        response.sendRedirect(request.getContextPath() + "/home.do");

    }
}

AccessControlFilter.java

package sa.cdc.svn.web.security;

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

import sa.cdc.svn.Constant;
import sa.cdc.svn.model.security.Certificate;

public class AccessControlFilter implements Filter {

    @Override
    public void destroy() {
    }

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        HttpServletResponse httpResponse = (HttpServletResponse) response;

        String uri = httpRequest.getRequestURI();
        uri = uri.substring(uri.indexOf('/', 1));
        request.setAttribute("uriClick", uri);
        // no need to filter login.do
        if (uri.equals("/login.do") || uri.equals("/home.do") || uri.equals("/logout.do")
                || uri.equals("/search.do")) {

            chain.doFilter(request, response);
            return;
        }

        HttpSession session = httpRequest.getSession();
        Certificate certificate = (Certificate) session.getAttribute(Constant.SESSION_CERTIFICATE);
        if (certificate != null) {
            // validate user if have permission
            boolean permitted = false;
            if (uri.equals("/authz.do"))
                permitted = certificate.hasRole(Constant.ROLE_CONTROLLER);
            else if (uri.equals("/group.do"))
                permitted = certificate.hasRole(Constant.ROLE_CONTROLLER);
            else if (uri.equals("/alias.do"))
                permitted = certificate.hasRole(Constant.ROLE_CONTROLLER);
            else if (uri.equals("/account.do"))
                permitted = certificate.hasRole(Constant.ROLE_CONTROLLER);
            else if (uri.equals("/davsvn.do"))
                permitted = certificate.hasRole(Constant.ROLE_ADMINISTRATOR);
            else if (uri.equals("/admin.do"))
                permitted = certificate.hasRole(Constant.ROLE_ADMINISTRATOR);
            else if (uri.equals("/webclient.do"))
                permitted = certificate.hasRole(Constant.ROLE_ADMINISTRATOR);
            if (permitted) {
                // transfer to next filter
                chain.doFilter(request, response);
            } else {
                // redirect to deny page
                request.setAttribute(Constant.STATUS, Constant.STATUS_WARNING);
                request.setAttribute(Constant.MESSAGE, "Have no permission to access " + uri + ".");
                request.getRequestDispatcher("/WEB-INF/pages/deny.jsp").forward(request, response);
            }
        } else {
            // go to login to get a certificate, forward to this uri after login
            httpResponse.sendRedirect(httpRequest.getContextPath() + "/login.do?uri=" + uri);
        }
    }
}
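
The chain of `if`/`else if` tests in the filter is effectively a lookup table from URI to required role. A minimal sketch of that idea, kept free of the servlet API so it runs on its own, is shown below. The class `AccessRules`, the role strings, and the context path `/svnadmin` are hypothetical stand-ins for `Constant.ROLE_CONTROLLER`, `Constant.ROLE_ADMINISTRATOR`, and the real deployment.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AccessRules {
    // Hypothetical role names standing in for the Constant.ROLE_* values
    static final String CONTROLLER = "controller";
    static final String ADMINISTRATOR = "administrator";

    // URI -> role required to access it
    private static final Map<String, String> REQUIRED_ROLE = new HashMap<>();
    static {
        for (String uri : new String[] {"/authz.do", "/group.do", "/alias.do", "/account.do"}) {
            REQUIRED_ROLE.put(uri, CONTROLLER);
        }
        for (String uri : new String[] {"/davsvn.do", "/admin.do", "/webclient.do"}) {
            REQUIRED_ROLE.put(uri, ADMINISTRATOR);
        }
    }

    // Strips the context path from a request URI; also works for the root context ("").
    static String stripContextPath(String requestUri, String contextPath) {
        return requestUri.startsWith(contextPath)
                ? requestUri.substring(contextPath.length())
                : requestUri;
    }

    // As in the filter above, a URI without a rule is denied.
    static boolean isPermitted(String uri, Set<String> userRoles) {
        String required = REQUIRED_ROLE.get(uri);
        return required != null && userRoles.contains(required);
    }

    public static void main(String[] args) {
        Set<String> roles = new HashSet<>(Arrays.asList(CONTROLLER));
        String uri = stripContextPath("/svnadmin/group.do", "/svnadmin");
        System.out.println(uri + " permitted: " + isPermitted(uri, roles));
        System.out.println("/admin.do permitted: " + isPermitted("/admin.do", roles));
        // prints:
        // /group.do permitted: true
        // /admin.do permitted: false
    }
}
```

Stripping the context path by its length, rather than searching for the second slash, avoids surprises when the application is deployed at the root context.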

meta.jsp


<meta http-equiv="Expires" content="0" />
<meta http-equiv="Pragma" content="no-cache" />
<meta http-equiv="Cache-Control" content="no-store" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="icon" href="<c:url value='/images/favicon.ico'/>" type="image/x-icon" />
<link rel="shortcut icon" href="<c:url value='/images/favicon.ico'/>" type="image/x-icon" />
<link rel="stylesheet" type="text/css" href="<c:url value='/styles/global.css'/>" />
<script type="text/javascript" src="<c:url value='/scripts/global.js'/>"></script>

taglibs.jsp

<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
<%@ taglib uri="http://java.sun.com/jsp/jstl/fmt" prefix="m"%>
<%@ taglib uri="http://java.sun.com/jsp/jstl/sql" prefix="s"%>
<%@ taglib uri="http://java.sun.com/jsp/jstl/core" prefix="c"%>
<%@ taglib uri="http://java.sun.com/jsp/jstl/functions" prefix="n"%>

status.jsp

<%@ include file="/common/taglibs.jsp"%>
<div id="status">
<span style="background:url(<c:url value='/images/${status}.gif'/>) no-repeat;">
     <c:out value="${message}" />
</span>
</div>

Friday, May 01, 2009

Install Solaris10 u3 sparc on Fujitsu PRIMEPOWER450

Keywords

Fujitsu, Solaris, SPARC

Environment

Fujitsu PRIMEPOWER450 2x SPARC64 V
Solaris 10 u3 sparc

IP: 61.144.43.227
Network: 61.144.43.224, Netmask: 255.255.255.240
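
The relationship between these values can be checked with a little bit arithmetic: ANDing an address with the netmask gives the network address. The sketch below is only an illustration; the class `SubnetCheck` and its helper methods are hypothetical, and the router address 61.144.43.234 is the one entered later in the installer.

```java
public class SubnetCheck {
    // Converts dotted-quad notation to a 32-bit integer.
    static int toInt(String dotted) {
        int value = 0;
        for (String part : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }

    // Converts a 32-bit integer back to dotted-quad notation.
    static String toDotted(int v) {
        return ((v >>> 24) & 0xFF) + "." + ((v >>> 16) & 0xFF) + "."
                + ((v >>> 8) & 0xFF) + "." + (v & 0xFF);
    }

    // Network address = IP address AND netmask.
    static String network(String ip, String mask) {
        return toDotted(toInt(ip) & toInt(mask));
    }

    public static void main(String[] args) {
        String mask = "255.255.255.240";
        System.out.println(network("61.144.43.227", mask)); // prints 61.144.43.224
        System.out.println(network("61.144.43.234", mask)); // prints 61.144.43.224
    }
}
```

Both the host and the default router fall in the same network, which is what the installer needs for the default route to be reachable.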

Background

References
Solaris 10 11/06 Installation Guide: Basic Installations
http://docs.sun.com/app/docs/doc/819-6394

Downloads (on a PC)
CD: http://www.sun.com/download/sdl.jsp?5005588c-36f3-11d6-9cec-fc96f718e113=1
DVD: http://www.sun.com/download/sdl.jsp?5005588c-36f3-11d6-9cec-fc96f718e113=1
sol-10-u3-ga-sparc-dvd-iso-a.zip
sol-10-u3-ga-sparc-dvd-iso-b.zip
sol-10-u3-ga-sparc-dvd-iso-c.zip
sol-10-u3-ga-sparc-dvd-iso-d.zip
sol-10-u3-ga-sparc-dvd-iso-e.zip
sol-10-u3-companion-ga-iso.zip
sol-10-u3-ga-sparc-v1-iso.zip
md5sum-sparc.list.txt

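Task overview
Install Solaris 10 SPARC on a midrange server with two CPUs, two network cards, and three hard disks; full installation.
Rule 1: configure all network interface parameters in the installer.

Starting the installation

Preparing the installation media

Unzip sol-10-u3-ga-sparc-v1-iso.zip and burn it to an installation CD.

Unzip the following five files: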
sol-10-u3-ga-sparc-dvd-iso-a.zip
sol-10-u3-ga-sparc-dvd-iso-b.zip
sol-10-u3-ga-sparc-dvd-iso-c.zip
sol-10-u3-ga-sparc-dvd-iso-d.zip
sol-10-u3-ga-sparc-dvd-iso-e.zip

Then run the following command:
copy /b sol-10-u3-ga-sparc-dvd-iso-a + sol-10-u3-ga-sparc-dvd-iso-b + sol-10-u3-ga-sparc-dvd-iso-c + sol-10-u3-ga-sparc-dvd-iso-d + sol-10-u3-ga-sparc-dvd-iso-e sol-10-u3-ga-sparc-dvd.iso

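This produces the DVD image: sol-10-u3-ga-sparc-dvd.iso

Place sol-10-u3-ga-sparc-dvd.iso on the second hard disk, /dev/dsk/c1t0d0s0.

Running the installation

Boot from the installation CD into the installer and choose the Solaris Interactive installation method.
Network Connectivity: Networked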
Please select the interfaces you want to configure: hme0
Use DHCP for hme0: No
Host Name for hme0: jwcdb
IP Address for hme0: 61.144.43.227
Netmask for hme0: 255.255.255.240
Enable IPv6 for hme0: No
Set the Default Route for hme0: Specify one
Route IP Address for hme0: 61.144.43.234
Enable Kerberos: No
Name Services: DNS
Domain Name: pyp.edu.cn
Server's IP Address:
202.96.134.133
61.144.56.101
DNS Search List:
pyp.edu.cn
Accept
Select Software Localizations: Asia->Chinese(zh)
Select System Locale: English(POSIX)(C)
Package:Entire
Solaris: 44996

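During installation, choose to enable the relevant network services; otherwise you cannot log in through XManager.

Reboot after the pre-installation phase.

Mount the DVD installation image: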
? mkdir /main
? mkdir /mnt/iso
? mount -F ufs /dev/dsk/c1t0d0s0 /main
? lofiadm -a /main/solaris10u3/sol-10-u3-ga-sparc-dvd.iso
? mount -F hsfs -o ro /dev/lofi/1 /mnt/iso

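Skip the language installation in the later stage, because it is already included in the earlier installation.

Initial setup:

Allow root to log in over SSH: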
? vi /etc/ssh/sshd_config
PermitRootLogin yes
? svcadm restart network/ssh

Change root's shell to bash:
? vi /etc/passwd
root:x:0:0:Super-User:/:/usr/bin/bash

Set the bash prompt to "? ":
? vi /etc/profile
# Add by litchi
PATH=/usr/bin:/usr/sbin:/usr/sadm/bin:/usr/sfw/bin:/usr/ucb:/etc:.
PS1='? '

Display a system message at login:
? more /etc/motd
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
*************************************************************
*** Jiaowu Education System Provided by PanYu Polytechinc ***
*************************************************************

Disable Telnet:
? inetadm | grep telnet
enabled online svc:/network/telnet:default
? inetadm -d telnet

Create users:

Change the FTP welcome message:
? vi /etc/ftpd/welcome.msg
"/etc/ftpd/welcome.msg" 1 line, 55 characters
Welcome to Jiaowu System Provided by Panyu Polytechnic

Changes to speed up booting:
# vi /boot/solaris/strap.rc
Options timeout=3^M
The default is 10; set to 3 here.

# vi /etc/bootrc
set boot_timeout 0
The default is 5; set to 0 here.

# vi /boot/solaris/bootenv.rc
setprop auto-boot-timeout 0
The default is 5; set to 0 here.

Install Solaris 10 u1 x86 on PC

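Task overview

Install Solaris 10 x86 on an ordinary desktop PC with two network cards and two hard disks; full installation.
Rule 1: configure all network interface parameters in the installer.

Starting the installation

Burn Solaris 10 Disk 1 to an installation CD,
place the Solaris 10 DVD image and the Language image on the second logical partition,
and prepare an unallocated primary partition under DOS or Windows.
Boot from the installation CD into the installer and choose the Solaris Interactive installation method.
Network Connectivity: Networked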
Please select the interfaces you want to configure: rtls0
Use DHCP for rtls0: No
Host Name for rtls0: outer
IP Address for rtls0: 192.168.224.251
Netmask for rtls0: 255.255.255.0
Enable IPv6 for rtls0: No
Set the Default Route for rtls0: Specify one
Route IP Address for rtls0: 192.168.224.1
Enable Kerberos: No
Name Services: DNS
Domain Name: c204.com
Server's IP Address:
202.96.134.133
202.96.128.68
DNS Search List:
c204.com
Accept
Select Software Localizations: Asia->Chinese(zh)
Select System Locale: English(POSIX)(C)
Package:Entire
Solaris: 44996

Reboot after the pre-installation phase.

Mount the DVD installation image:
? mkdir /mnt/saved
? mkdir /mnt/iso
? mount -F pcfs /dev/dsk/c0d0p3:2 /mnt/saved
? lofiadm -a /mnt/saved/Solaris/sol-10-GA-x86-dvd-iso.iso
? mount -F hsfs -o ro /dev/lofi/1 /mnt/iso

Next, mount the Language installation image:
? umount /mnt/iso
? lofiadm -d /dev/lofi/1
? lofiadm -a /mnt/saved/Solaris/sol-10-lang-GA-x86-iso.iso
? mount -F hsfs -o ro /dev/lofi/1 /mnt/iso

Initial setup:
Allow root to log in over SSH:
? vi /etc/ssh/sshd_config
PermitRootLogin yes
? svcadm restart network/ssh

Change root's shell to bash:
? vi /etc/passwd
root:x:0:0:Super-User:/:/usr/bin/bash

Set the bash prompt to "? ":
? vi /etc/profile
# Add by litchi
PATH=/usr/bin:/usr/sbin:/usr/sadm/bin:/usr/sfw/bin:/usr/ucb:/etc:.
PS1='? '

Display a system message at login:
# more /etc/motd
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
*************************************************************
* Provided By Litchi chenliqun@pyp.edu.cn PanYu Polytechinc *
*************************************************************

Changes to speed up booting:
# vi /boot/solaris/strap.rc
Options timeout=3^M
The default is 10; set to 3 here.

# vi /etc/bootrc
set boot_timeout 0
The default is 5; set to 0 here.

# vi /boot/solaris/bootenv.rc
setprop auto-boot-timeout 0
The default is 5; set to 0 here.

Install Oracle 10g Release 2 on Solaris 10 u3 sparc

Keywords

Solaris, Oracle, SPARC

Environment

Fujitsu PRIMEPOWER450 2x SPARC64 V
Solaris 10 u3 sparc
Oracle 10g Release 2

Background

References
Installation guide, inside B19306-01.zip:
B19306_01\install.102\b15690\toc.htm
Sun ISV：http://isv.sun.com.cn/techdocs/0604/Solaris10_oracle.jsp

Downloads (on a PC)
doc: http://www.oracle.com/technology/documentation/database10gr2.html
data: http://www.oracle.com/technology/software/products/database/oracle10g/htdocs/10201sol64soft.html
B19306-01.zip
10gr2_db_sol.cpio.gz

Preparing the installation files

Decompress:
10gr2_db_sol.cpio.gz

Upload to the server:
10gr2_db_sol.cpio

Extract:
? cpio -idmv < 10gr2_db_sol.cpio

Checking the system environment

Log in to the server as root.

Check the OS version (5.10, 64-bit sparcv9 required)
? uname -a
SunOS jwcdb 5.10 Generic_118833-33 sun4us sparc FJSV,GPUZC-M
? isainfo -kv
64-bit sparcv9 kernel modules

Check physical memory (1024 MB or more required)
? /usr/sbin/prtconf | grep "Memory size"
Memory size: 4096 Megabytes

Check swap space (the requirement depends on the amount of memory)
? /usr/sbin/swap -lh
swapfile dev swaplo blocks free
/dev/dsk/c0t0d0s1 32,129 16 12286064 12286064

Check temporary file space (400 MB or more required)
? df -h /tmp
Filesystem size used avail capacity Mounted on
swap 7.4G 88K 7.4G 1% /tmp

Check disk space (2 GB or more required)
? df -h
Filesystem size used avail capacity Mounted on
/dev/dsk/c0t0d0s0 12G 3.7G 7.7G 33% /
/devices 0K 0K 0K 0% /devices
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 7.4G 1.1M 7.4G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
fd 0K 0K 0K 0% /dev/fd
swap 7.4G 88K 7.4G 1% /tmp
swap 7.4G 64K 7.4G 1% /var/run
/dev/dsk/c1t1d0s6 67G 2.9G 64G 5% /data -for oracle data
/dev/dsk/c1t0d0s0 67G 11G 55G 17% /main -for oracle software
/dev/dsk/c0t0d0s7 50G 2.7G 47G 6% /export/home

Check whether the following packages are installed
? pkginfo -i SUNWarc SUNWbtool SUNWhea SUNWlibm SUNWlibms SUNWsprot \
> SUNWsprox SUNWtoo SUNWi1of SUNWi1cs SUNWi15cs SUNWxwfnt
system SUNWarc Lint Libraries (usr)
system SUNWbtool CCS tools bundled with SunOS
system SUNWhea SunOS Header Files
system SUNWi1of ISO-8859-1 (Latin-1) Optional Fonts
system SUNWlibm Math & Microtasking Library Headers & Lint Files (Usr)
system SUNWlibms Math & Microtasking Libraries (Usr)
system SUNWsprot Solaris Bundled tools
system SUNWtoo Programming Tools
system SUNWxwfnt X Window System platform required fonts
ERROR: information for "SUNWi1cs" was not found
ERROR: information for "SUNWi15cs" was not found
ERROR: information for "SUNWsprox" was not found

Two packages need to be installed here: SUNWi1cs and SUNWi15cs.

Mount the DVD installation image:
? lofiadm -a /main/solaris10u3/sol-10-u3-ga-sparc-dvd.iso
/dev/lofi/1
? mount -F hsfs -o ro /dev/lofi/1 /mnt/iso

Install the two packages:
? pkgadd -d /mnt/iso/Solaris_10/Product/ SUNWi1cs
? pkgadd -d /mnt/iso/Solaris_10/Product/ SUNWi15cs

In Solaris 10, SUNWsprox has been merged into SUNWsprot, so that package is not needed.

Unmount the image:
? umount /mnt/iso/
? lofiadm -d /dev/lofi/1

Because Solaris 10 u3 is fairly recent, no patches are applied.

Checking the network configuration

Make sure "files" appears on the hosts line:
? cat /etc/nsswitch.conf | grep hosts
# DNS for hosts lookups, otherwise it does not use any other naming service.
# "hosts:" and "services:" in this file are used only if the
hosts: files dns
# before searching the hosts databases.

Host name:
? hostname
jwcdb

Domain name (should return nothing):
? domainname

Check whether the fully qualified host name is used:
? cat /etc/hosts | grep 'jwcdb'
61.144.43.227 jwcdb jwcdb.pyp.edu.cn loghost

Creating users and groups

Create the Oracle Inventory group

If Oracle is already installed, the following output is shown:
? more /var/opt/oracle/oraInst.loc
inventory_loc=/main/oracle/oraInventory
inst_group=oinstall

If necessary, create the oinstall group:
? groupadd oinstall

Create the OSDBA group

? groupadd dba

Create the OSOPER group

? groupadd oper

Create the oracle user:

Check whether the oracle user exists:
? id -a oracle

Create it:
? useradd -g oinstall -G dba,oper -d /export/home/oracle -s /usr/bin/bash -c 'Oracle Software Owner' -m oracle

Set the password:
? passwd -r files oracle

Make sure the nobody user exists:
? id -a nobody
uid=60001(nobody) gid=60001(nobody) groups=60001(nobody)

Configuring system parameters

? vi /etc/system
set noexec_user_stack=1

Check whether a resource project has been created for oracle:
? grep oracle /etc/project

If not, create one as follows:
? projadd -U oracle -c "oracle database" user.oracle
? projmod -sK "project.max-shm-memory=(priv,10G,deny)" user.oracle

Switch to the oracle user and check:
% prctl -n project.max-shm-memory -i project user.oracle

Creating the required directories

Software directory
? mkdir -p /main/oracle
? chown -R oracle:oinstall /main/oracle
? chmod -R 775 /main/oracle

Data directory
? mkdir -p /data/oracle
? chown -R oracle:oinstall /data/oracle
? chmod -R 775 /data/oracle

Backup directory
? mkdir -p /main/oracle/flash_recovery_area
? chown -R oracle:oinstall /main/oracle/flash_recovery_area
? chmod -R 775 /main/oracle/flash_recovery_area

Configuring environment variables

Log in as oracle
? su - oracle

Set the environment variables
? vi .profile
umask 022
ORACLE_BASE=/main/oracle
ORACLE_SID=ORCL
export ORACLE_BASE ORACLE_SID

Log in again and verify
% umask
0022
% env | more
-bash-3.00$ env | more
......
ORACLE_SID=ORCL
ORACLE_BASE=/main/oracle
......

Reboot the system
? reboot

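Running the installer

Make sure the system can run a graphical interface locally or remotely. The remote desktop used here is X Manager 2.0.
Open XBrowser and enter the server's IP address to connect.

Log in as the oracle user.

Run runInstaller in the database directory.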

Select Installation Method
Advanced Installation

Specify Inventory directory and credentials
/opt/oracle/oraInventory
oinstall

Select Installation Type
Enterprise Edition

Specify Home Details
Name: OraDb10g_home1
Path: /main/oracle/oracle/product/10.2.0/db_1

Product-Specific Prerequisite Checks
All checks must pass.

Select Configuration Option
Create a database

Select Database Configuration
General Purpose

Specify Database Configuration Options
Global Database Name:cdtdb SID:ORCL
Select Database Character set: Simplified Chinese ZHS16GBK
Select "Create database with sample schemas"

Select Database Management Option
Select "Use Database Control for Database Management"
Enable Email Notification
61.144.43.235
sunkist@pyp.edu.cn

Specify Database Storage Option
Select "File System"
location:/main/oracle/oradata

Specify Backup and Recovery Options
Select "Automated backups"
location:/main/oracle/flash_recovery_area

Specify Database Schema Passwords
Use the same password for all accounts: ***

Summary

Execute Configuration Scripts
As prompted, run the scripts as root.

Installation complete!

Database instance creation log:
/main/oracle/oracle/product/10.2.0/db_1/cfgtoollogs/dbca/ORCL

iSQL*Plus URL:
http://jwcdb:5560/isqlplus

iSQL*Plus DBA URL:
http://jwcdb:5560/isqlplus/dba

Enterprise Manager 10g Database Control URL:
http://jwcdb:1158/em
Log in as the sys user with the SYSDBA role.

Completing the service configuration

Log in as oracle
? su - oracle

Edit the profile:
% vi .profile
# This is the default standard profile provided to a user.
# They are expected to edit it to meet their own needs.

MAIL=/usr/mail/${LOGNAME:?}
PS1='% '
umask 022
ORACLE_BASE=/main/oracle
export ORACLE_BASE
ORACLE_SID=ORCL
export ORACLE_SID
ORACLE_HOME=$ORACLE_BASE/oracle/product/10.2.0/db_1
export ORACLE_HOME
NLS_LANG="simplified chinese"_china.zhs16gbk  # comment out this line to use English
export NLS_LANG
PATH=/usr/bin:/usr/sbin:$ORACLE_HOME/bin:/usr/sadm/bin:/usr/sfw/bin:/usr/ucb:/etc:.
export PATH

Run the script:
% sqlplus "/ AS SYSDBA"
SQL> @?/rdbms/admin/utlrp.sql

Starting automatically at boot

Configure the database instance entry correctly:
% vi /var/opt/oracle/oratab
ORCL:/main/oracle/oracle/product/10.2.0/db_1:Y

Fix the dbstart script:
% vi /main/oracle/oracle/product/10.2.0/db_1/bin/dbstart

Change ORACLE_HOME_LISTNER=/ade/vikrkuma_new/oracle
to     ORACLE_HOME_LISTNER=$ORACLE_HOME

Change ORATAB=/etc/oratab
to     ORATAB=/var/opt/oracle/oratab

Create the startup script:
% vi /etc/init.d/oracle
"/etc/init.d/oracle" 24 lines, 911 characters
#!/bin/sh
ORA_HOME=/main/oracle/oracle/product/10.2.0/db_1
ORA_OWNER=oracle
if [ ! -f $ORA_HOME/bin/dbstart ]
then
echo "Oracle startup: cannot start"
exit
fi
case "$1" in
'start') # Start the Oracle databases and listeners
echo "Start Oracle Database 10g Release 2"
su - $ORA_OWNER -c "$ORA_HOME/bin/dbstart"
su - $ORA_OWNER -c "$ORA_HOME/bin/lsnrctl start"
su - $ORA_OWNER -c "$ORA_HOME/bin/emctl start dbconsole"
su - $ORA_OWNER -c "$ORA_HOME/bin/isqlplusctl start"
;;
'stop') # Stop the Oracle databases and listeners
echo "Stop Oracle Database 10g Release 2"
su - $ORA_OWNER -c "$ORA_HOME/bin/isqlplusctl stop"
su - $ORA_OWNER -c "$ORA_HOME/bin/emctl stop dbconsole"
su - $ORA_OWNER -c "$ORA_HOME/bin/lsnrctl stop"
su - $ORA_OWNER -c "$ORA_HOME/bin/dbshut"
;;
esac

Create the symbolic links:
? ln -s /etc/init.d/oracle /etc/rc2.d/S99oracle
? ln -s /etc/init.d/oracle /etc/rc0.d/K10oracle
