PHP: crawlers see URLs as 302 redirects, but no redirect actually happens. Why does this occur?

Tags: php, .htaccess, redirect, web-crawler, magento-1.7

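I work on an e-commerce site with a Magento backend. Everything used to work fine, but the problem started suddenly and we do not know where it comes from. Some of our old URLs and all of our newly created URLs are not being indexed, because crawlers see the links as 302 redirects, even though the same URLs return 200 OK when opened in a browser. When we run SEO tools, everything SEO-related looks fine. On closer inspection we found that some of these 302 links change to 200 OK, but after a while they go back to 302. Can anyone help me solve this? Because of it, none of our newly created pages are being indexed.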
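One way to reproduce the symptom described above is to request the same URL with different `User-Agent` headers and without following redirects. The sketch below is only an illustration: the two User-Agent strings are assumptions (real crawlers and SEO tools may send different ones), and the helper names `fetch_status`, `compare` and `diverges` are hypothetical.

```python
import http.client
from urllib.parse import urlsplit

# Assumed User-Agent strings for the comparison; adjust to the clients you care about.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
               "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "crawler": "Mozilla/5.0 (compatible; Googlebot/2.1; "
               "+http://www.google.com/bot.html)",
}


def fetch_status(url, user_agent, timeout=10.0):
    """Return (status, Location header or None) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client never follows redirects, so a 302 is reported as-is.
        conn.request("GET", path, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        response.read()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def compare(url):
    """Fetch `url` once per entry in USER_AGENTS and collect the results."""
    return {label: fetch_status(url, agent) for label, agent in USER_AGENTS.items()}


def diverges(results):
    """True when the clients in `results` did not all receive the same status."""
    return len({status for status, _location in results.values()}) > 1
```

Calling `compare()` on one of the affected URLs and passing the result to `diverges()` would show whether the server answers differently depending on the User-Agent. Since the status reportedly flips over time, repeating the check at intervals may be more informative than a single run.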

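Can you post your code so people can check it?

Have you checked the server logs to confirm whether these pages are being served correctly to both users and bots? There are too many variables to say why your pages might behave this way; you will have to do some old-fashioned detective work to find out. It could be a greedy regular expression in the .htaccess, or code in the page that returns the wrong HTTP response code. If it only affects bots, you should also check your robots.txt file.

Hi, I have edited the question to include the .htaccess and robots.txt files we are currently running. Could you review them? Let me know if you need anything else.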
.htaccess
    ##### Add support for SVG Images and CSS3 Pie #####

    AddType image/svg+xml svg svgz
    AddEncoding gzip svgz
    AddType text/x-component .htc
    DirectoryIndex index.php 

##### PHP Settings for your domain #####

<IfModule mod_php5.c> 
    php_value memory_limit 512M 
    php_value max_execution_time 18000 
    php_flag magic_quotes_gpc off 
    php_flag session.auto_start off 
    php_flag suhosin.session.cryptua off 
    php_flag zend.ze1_compatibility_mode Off 
</IfModule>

##### Search Engine redirects and rewrites for SEO purposes #####

<IfModule mod_rewrite.c>
    #RewriteCond %{HTTP_HOST} !^www.alshop.com$ [NC]
    #RewriteRule ^(.*)$ http://www.alshop.com/$1 [R=301,L]

    ##### Redirect away from /index.php and /home   
    ##### Warning: This index.php rewrite will prevent Magento 
    ##### Connect from working. Simply comment out the  
    ##### following two lines of code when using Connect.
    ##### Please note - http://www. if not using www simply use http://

    RewriteCond %{THE_REQUEST} ^.*/index.php
    RewriteRule ^(.*)index.php$ http://www.alshop.com/$1 [R=301,L]

    ##### Please note - http://www. if not using www simply use http://
    redirect 301 /home http://www.alshop.com

    Options +FollowSymLinks
    RewriteEngine on
    RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
    RewriteCond %{REQUEST_URI} !^/(media|skin|js)/
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-l
    RewriteRule .* index.php [L]

    RewriteRule ^(.*)$ $1 [NS,E=no-gzip:1,E=dont-vary:1]
</IfModule>


##### mod_deflate compresses your output to lower the file size being sent to the client #####

<IfModule mod_deflate.c>

    php_flag zlib.output_compression off
    SetEnvIfNoCase Request_URI \.(html?|txt|css|js|php|pl)$$ no-gzip dont-vary
</IfModule>


<IfModule mod_ssl.c>
    SSLOptions StdEnvVars 
</IfModule>


##### Header Directives #####

<ifModule mod_headers.c>
    Header unset ETag
    Header unset Last-Modified
</ifModule>



 ##### Disable ETags http://developer.yahoo.com/performance/rules.html#etags #####

    FileETag None


##### Prevent character encoding issues from server overrides #####

    AddDefaultCharset Off
    #AddDefaultCharset UTF-8


##### By default allow all access #####

    #Order allow,deny
    #Allow from all





## robots.txt

User-agent: *


## Crawl-delay parameter: number of seconds to wait between successive requests to the same server.
## Set a custom crawl rate if you're experiencing traffic problems with your server.
# Crawl-delay: 30

## DEVELOPMENT RELATED SETTINGS

## Do not crawl development files and folders: CVS, svn directories and dump files
Disallow: /CVS
Disallow: /*.svn$
Disallow: /*.idea$
Disallow: /*.sql$
Disallow: /*.tgz$

## GENERAL MAGENTO SETTINGS

## Do not crawl Magento admin page
Disallow: /admin/

## Do not crawl common Magento technical folders
Disallow: /app/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /lib/
Disallow: /pkginfo/
Disallow: /shell/
Disallow: /var/

## Do not crawl common Magento files
Disallow: /api.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /get.php
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /README.txt
Disallow: /RELEASE_NOTES.txt

## MAGENTO SEO IMPROVEMENTS

## Do not crawl sub category pages that are sorted or filtered.
Disallow: /*?dir*
Disallow: /*?dir=desc
Disallow: /*?dir=asc
Disallow: /*?limit=all
Disallow: /*?mode*
## Disallow: /*?*

## Do not crawl 2-nd home page copy (example.com/index.php/). Uncomment it only if you activated Magento SEO URLs.
Disallow: /index.php/

## Do not crawl links with session IDs
Disallow: /*?SID=

## Do not crawl checkout and user account pages
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/

## Do not crawl search pages and not-SEO optimized catalog links
Disallow: /catalogsearch/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/

## SERVER SETTINGS

## Do not crawl common server technical folders and files
Disallow: /cgi-bin/
Disallow: /cleanup.php
Disallow: /apc.php
Disallow: /memcache.php
Disallow: /phpinfo.php

## IMAGE CRAWLERS SETTINGS

## Extra: Uncomment if you do not wish Google and Bing to index your images
# User-agent: Googlebot-Image
# Disallow: /
# User-agent: msnbot-media
# Disallow: /
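
The server-log check suggested in the comments could be sketched as follows. This is a minimal example that assumes the access log uses Apache's "combined" format; the list of bot markers is a rough heuristic, and the names `LOG_LINE`, `BOT_MARKERS`, `classify` and `status_by_client` are hypothetical.

```python
import re
from collections import Counter

# Assumes Apache "combined" log format:
# host ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"\S+ (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Rough heuristic for spotting crawlers by User-Agent; extend as needed.
BOT_MARKERS = ("googlebot", "bingbot", "crawler", "spider")


def classify(agent):
    """Label a User-Agent string as 'bot' or 'other'."""
    agent = agent.lower()
    return "bot" if any(marker in agent for marker in BOT_MARKERS) else "other"


def status_by_client(lines):
    """Count responses per (client kind, HTTP status); unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        key = (classify(match.group("agent")), int(match.group("status")))
        counts[key] += 1
    return counts
```

Feeding the lines of the access log into `status_by_client()` gives a quick view of whether 302 responses cluster on crawler traffic. If they do, the logged paths and timestamps of those entries are a natural next place to look.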
