老秘网_材夜思范文

Title: Web page scraper (super simple version)

Author: 福建老秘    Time: 2010-7-20 19:53

protected void btn_click(object sender, EventArgs e)
{
    // Method 1: download the raw bytes and decode them as gb2312.
    //System.Net.WebClient wc = new System.Net.WebClient();
    //byte[] b = wc.DownloadData("http://www.baidu.com");
    //string html = System.Text.Encoding.GetEncoding("gb2312").GetString(b);
    //html = html.Substring(html.IndexOf("<p id=\"lg\">") + "<p id=\"lg\">".Length);
    //html = html.Substring(0, html.IndexOf("</p>"));
    //Response.Write(html);

    // Method 2: read the whole page through a stream.
    // On .NET Framework, Encoding.Default is the system ANSI code page; the "true"
    // argument lets the reader switch encodings if the response starts with a byte order mark.
    string html;
    using (System.Net.WebClient wc = new System.Net.WebClient())
    using (System.IO.Stream sm = wc.OpenRead("http://www.baidu.com"))
    using (System.IO.StreamReader sr = new System.IO.StreamReader(sm, System.Text.Encoding.Default, true, 256000))
    {
        html = sr.ReadToEnd();
    }

    // Extract the wanted content by rule: everything between <p id="lg">
    // and the next </p>.
    const string startTag = "<p id=\"lg\">";
    int start = html.IndexOf(startTag);
    if (start < 0) return;   // start marker not found; nothing to output
    html = html.Substring(start + startTag.Length);

    int end = html.IndexOf("</p>");
    if (end < 0) return;     // closing tag not found
    html = html.Substring(0, end);

    Response.Write(html);
}
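
The "extract by rule" step in the method above cuts the page at a start marker and then at the next closing tag. String.IndexOf returns -1 when a marker is missing, so it is worth checking for that before calling Substring. Here is a minimal sketch of that step as a standalone helper. The class name HtmlSnippet and the method name ExtractBetween are made up for illustration and are not part of the original post.

```csharp
using System;

public static class HtmlSnippet
{
    // Hypothetical helper (not from the original post): returns the text
    // between the first occurrence of startMarker and the next endMarker,
    // or null when the input is empty or either marker is missing.
    public static string ExtractBetween(string html, string startMarker, string endMarker)
    {
        if (string.IsNullOrEmpty(html)) return null;

        // Locate the start marker; -1 means it is not on the page.
        int start = html.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0) return null;
        start += startMarker.Length;

        // Search for the end marker only after the start marker.
        int end = html.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0) return null;

        return html.Substring(start, end - start);
    }
}
```

With this helper, the two Substring calls could be replaced by a single call such as HtmlSnippet.ExtractBetween(html, "<p id=\"lg\">", "</p>"), followed by a null check before Response.Write.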


Author: 福建老秘    Time: 2010-7-20 20:00

http://hereson.javaeye.com/blog/207468





