Lower-cased URLs in Apache Access Log

Occasionally, my Apache access log shows requests for files with mixed-case names arriving in all lower case.

For example, a file named /Pages/Home.php will be requested as /pages/home.php.

It doesn't happen a lot, but enough to make me wonder…

Has anyone seen this before? Know what causes it?

6 Replies

The only reason I can think of is that some of the links on your pages are lower case, or somebody else linked to your website in lower case. (Even Internet Explorer isn't stupid enough to change the letter case of URLs.)

The log files should contain the user agent and the referer (if any). These would help you identify the offending browser and/or link.
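As a rough sketch of pulling those fields out, assuming the log uses Apache's "combined" format, something like the following could work. The regular expression and the `parse_line` helper are illustrative names, not part of Apache, and the sample line is one of the entries quoted in this thread.

```python
import re

# Apache "combined" format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one combined-format line, or None if it does not match."""
    match = COMBINED_RE.match(line)
    return match.groupdict() if match else None


sample = (
    'xxx.xxx.xxx.xxx - - [05/Dec/2011:10:05:26 -0800] '
    '"GET /xxx/teleprompter-richtext2.php HTTP/1.1" 404 2958 "-" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"'
)

entry = parse_line(sample)
# Print the fields that matter here: status, requested path, referer, user agent.
print(entry["status"], entry["path"], entry["referer"], entry["agent"])
```

A referer of `-` means the client sent none, which is what a typed-in URL or most bots would produce.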

````
xxx.xxx.xxx.xxx - - [05/Dec/2011:08:34:25 -0800] "GET /xxx/login-MySQL.php HTTP/1.1" 404 1210 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 1068) AppleWebKit/534.52.7 (KHTML, like Gecko) Version/5.1.2 Safari/534.52.7"

xxx.xxx.xxx.xxx - - [05/Dec/2011:10:05:26 -0800] "GET /xxx/teleprompter-richtext2.php HTTP/1.1" 404 2958 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"

xxx.xxx.xxx.xxx - - [05/Dec/2011:10:05:47 -0800] "GET /xxx/teleprompter-richtext2.php HTTP/1.1" 404 2958 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/4.0; GTB6.3; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; InfoPath.2; OfficeLiveConnector.1.5;
````

Here are three from today. The user agent strings are all different. In fact, the only similarity I see in the entries is that none of them provide a referrer.

I'm fairly confident it's not my links. I searched my entire code base for "teleprompter-richtext2.php", and there weren't any occurrences of it.
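One way to double-check that the failing requests really are lower-cased versions of real files, rather than unrelated names, is to compare them while ignoring case. This is only a sketch: the `case_only_mismatches` function and both lists are made up for illustration, with `/Pages/Home.php` borrowed from the example in the question.

```python
def case_only_mismatches(requested_paths, real_paths):
    """Map each requested path to the real path it matches only when case is ignored."""
    by_lower = {path.lower(): path for path in real_paths}
    mismatches = {}
    for requested in requested_paths:
        real = by_lower.get(requested.lower())
        # Keep only paths that exist under a different capitalization.
        if real is not None and real != requested:
            mismatches[requested] = real
    return mismatches


# Hypothetical data: real files on the server and paths seen in 404 log entries.
real_files = ["/Pages/Home.php", "/index.php"]
requested = ["/pages/home.php", "/index.php", "/missing.php"]

print(case_only_mismatches(requested, real_files))
# → {'/pages/home.php': '/Pages/Home.php'}
```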

@jzimmerlin:

> In fact, the only similarity I see in the entries is that none of them provide a referrer.

Sometimes people type URLs into the address bar :)

Otherwise, maybe it's some sort of broken user script?

Could also be bots of some sort – imagine something stupid, and there's an obscure search engine or email harvester that does it.

Yeah, I don't know what to make of it. The URLs aren't really public, in the sense that they're not posted on the home page or something. They're only for clients. So I don't know how a bot would discover them.

@jzimmerlin:

> Yeah, I don't know what to make of it. The URLs aren't really public, in the sense that they're not posted on the home page or something. They're only for clients. So I don't know how a bot would discover them.

Because your clients are stupid, typical end users, and their computers are massively infected with viruses or other malware/spyware that sends every link they visit somewhere else.

Hit arin.net and find out who owns the IPs of the computers visiting your site. The IP, the time stamp, and the page they are trying to hit are nearly the only parts of those entries that can't be spoofed.

For the most part, don't bother. Those log entries you listed are all 404 errors, so they aren't even pages that exist on your site. They're just people who have malware installed on their computers trying to find software that's easily compromised.
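If you do want to look up the addresses, a small tally of 404s per client IP narrows down which ones are worth checking. The sketch below assumes the combined log format. The `count_404s_by_host` helper and the sample lines are hypothetical, and the addresses come from the ranges reserved for documentation.

```python
from collections import Counter


def count_404s_by_host(lines):
    """Count 404 responses per client address in combined-format log lines."""
    counts = Counter()
    for line in lines:
        # Splitting on double quotes puts the status and size right after the request string.
        parts = line.split('"')
        if len(parts) < 3:
            continue
        status_fields = parts[2].split()
        if status_fields and status_fields[0] == "404":
            counts[line.split()[0]] += 1
    return counts


# Hypothetical log lines for illustration only.
log_lines = [
    '192.0.2.10 - - [05/Dec/2011:08:34:25 -0800] "GET /pages/home.php HTTP/1.1" 404 1210 "-" "ExampleAgent/1.0"',
    '192.0.2.10 - - [05/Dec/2011:08:35:01 -0800] "GET /pages/about.php HTTP/1.1" 404 1210 "-" "ExampleAgent/1.0"',
    '198.51.100.7 - - [05/Dec/2011:09:00:00 -0800] "GET /Pages/Home.php HTTP/1.1" 200 5120 "-" "ExampleAgent/1.0"',
]

for host, hits in count_404s_by_host(log_lines).most_common():
    print(host, hits)
# prints "192.0.2.10 2"
```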
