老秘网_材夜思范文

Title: Web page scraping program (super simple version)

Author: 福建老秘    Time: 2010-7-20 19:53
protected void btn_click(object sender, EventArgs e)
{
    // Method 1: download the raw bytes and decode them as GB2312.
    //System.Net.WebClient wc = new System.Net.WebClient();
    //byte[] b = wc.DownloadData("http://www.baidu.com");
    //string html = System.Text.Encoding.GetEncoding("gb2312").GetString(b);
    //html = html.Substring(html.IndexOf("<p id=\"lg\">") + "<p id=\"lg\">".Length);
    //html = html.Substring(0, html.IndexOf("</p>"));
    //Response.Write(html);

    // Method 2:
    // Fetch the whole page. Encoding.Default is the system code page; the
    // 'true' argument lets StreamReader switch encoding if it finds a byte order mark.
    string html;
    using (System.Net.WebClient wc = new System.Net.WebClient())
    using (System.IO.Stream sm = wc.OpenRead("http://www.baidu.com"))
    using (System.IO.StreamReader sr = new System.IO.StreamReader(sm, System.Text.Encoding.Default, true, 256000))
    {
        html = sr.ReadToEnd();
    }

    // Extract the wanted content by rule: everything between
    // <p id="lg"> and the first </p> that follows it.
    const string startTag = "<p id=\"lg\">";
    int start = html.IndexOf(startTag);
    if (start < 0) return;   // start marker not found, nothing to output
    html = html.Substring(start + startTag.Length);

    int end = html.IndexOf("</p>");
    if (end < 0) return;     // closing tag not found, nothing to output
    html = html.Substring(0, end);

    Response.Write(html);
}
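
The "extract by rule" step above is just "take the text between a start marker and the next end marker". As a rough sketch, that step could be pulled out into a small helper so it can be reused and tested without hitting the network. The names HtmlSlice and Between are made up for this illustration and are not part of the original post; returning null when a marker is missing is also an assumption about what the caller would want.

```csharp
using System;

// Hypothetical helper (not from the original post): pulls out the text that
// sits between a start marker and the first end marker that follows it.
public static class HtmlSlice
{
    // Returns the text between startMarker and the next endMarker,
    // or null if either marker cannot be found.
    public static string Between(string html, string startMarker, string endMarker)
    {
        int start = html.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0)
        {
            return null; // start marker missing
        }
        start += startMarker.Length;

        // Search for the end marker only after the start marker,
        // which matches what the two Substring calls in the post do.
        int end = html.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0)
        {
            return null; // end marker missing
        }
        return html.Substring(start, end - start);
    }
}
```

With this in place, the last lines of btn_click could shrink to something like `string logo = HtmlSlice.Between(html, "<p id=\"lg\">", "</p>");` followed by a null check before Response.Write. Plain string searching like this only works while the page keeps exactly that markup; if the target page changes its HTML, the markers have to be updated.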


Author: 福建老秘    Time: 2010-7-20 20:00

http://hereson.javaeye.com/blog/207468





