Identifying User Agents for Websites with Python

The User-Agent is an HTTP request header that a client (a browser or a program) sends to the server with each request. Websites use it to learn about the client: it names the client's browser, operating system, and device type. Servers can read the User-Agent to work out which kind of device is making the request and adapt the content they return. The client chooses this value itself, so it can be set to any string.

What the User Agent Is Used For

  1. Content adaptation: serving a design and content suited to the browser and device.

  2. Statistical analysis: collecting statistics on browsers, operating systems, and devices.

  3. Bot detection: identifying bots that scrape data.
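The third use, bot detection, is often approximated on the server side with a simple substring check. The sketch below is only illustrative: the name looks_like_bot and the marker list are assumptions made for this example, not part of any library. Real bot detection relies on much more than this header, because a client can send any User-Agent it likes.

```python
# Hypothetical helper: a naive heuristic, not a complete bot detector.
BOT_MARKERS = ("bot", "crawler", "spider", "python-requests", "curl")  # assumed list


def looks_like_bot(user_agent: str) -> bool:
    """Return True if the User-Agent contains a well-known automation marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
chrome = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
)
print(looks_like_bot(googlebot))  # True
print(looks_like_bot(chrome))     # False
```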

Structure of a User-Agent String

Consider a typical User-Agent string:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36

This string describes the client's browser, operating system, and other components. Its structure:

  • Compatibility token: Mozilla/5.0 (kept for historical reasons; nearly every modern browser sends it, so on its own it does not identify the browser)

  • Operating system: (Windows NT 10.0; Win64; x64)

  • Rendering engine: AppleWebKit/537.36 (KHTML, like Gecko)

  • Browser name and version: Chrome/92.0.4515.107

  • Additional information: Safari/537.36 (a compatibility token that Chrome also sends)
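The breakdown above can be reproduced with two regular expressions. This is a minimal sketch: split_user_agent and both patterns are written for this example and only cover the common "Product/version (comment)" layout, not every User-Agent in the wild.

```python
import re

# Matches "Product/version" tokens such as "Chrome/92.0.4515.107".
PRODUCT_RE = re.compile(r"([A-Za-z][A-Za-z0-9_-]*)/([0-9][0-9A-Za-z.]*)")
# Matches parenthesised comments such as "(Windows NT 10.0; Win64; x64)".
COMMENT_RE = re.compile(r"\(([^)]*)\)")


def split_user_agent(user_agent: str) -> dict:
    """Split a User-Agent string into product tokens and comment groups."""
    comments = COMMENT_RE.findall(user_agent)
    # Drop the comments first so text inside parentheses is not read as a product.
    without_comments = COMMENT_RE.sub(" ", user_agent)
    products = PRODUCT_RE.findall(without_comments)
    return {"products": products, "comments": comments}


ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
)
parts = split_user_agent(ua)
print(parts["products"])
print(parts["comments"])
```

For the sample string, the products list holds the Mozilla, AppleWebKit, Chrome, and Safari tokens with their versions, and the comments list holds the operating system group and the "KHTML, like Gecko" group.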

1 Setting the User Agent in Python

With the requests library, you can set the User-Agent manually when sending requests to websites.

Installation

Install the requests library with the following command:

pip install requests

Setting a Single User Agent

The following function shows how to set the User-Agent when sending a request to a website:

import requests

def get_page_with_user_agent(url, user_agent):
    """
    Send a GET request to the given URL with a custom User-Agent.
    """
    headers = {
        "User-Agent": user_agent  # Set the User-Agent header
    }

    response = requests.get(url, headers=headers)  # Send the GET request
    return response

# URL and User-Agent for testing
url = "https://httpbin.org/headers"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
response = get_page_with_user_agent(url, user_agent)

# Print the response
print(response.json())

Analysis:

  • headers = {"User-Agent": user_agent}: creates the headers dictionary that carries the User-Agent value.

  • requests.get(url, headers=headers): sends a request to the given URL with the headers parameter, so the request includes the chosen User-Agent.
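When many requests should share the same User-Agent, passing headers on every call becomes repetitive. One option is a requests.Session, whose headers are merged into each request it sends. The sketch below is an illustration rather than part of the original examples: the USER_AGENT constant is an assumed sample value, and the request is only prepared, not sent, so the final header can be inspected without network access. Without any override, requests identifies itself as python-requests followed by its version.

```python
import requests

# Assumed example value; any User-Agent string can be used here.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
)

session = requests.Session()
# Headers set on the session are merged into every request it sends.
session.headers.update({"User-Agent": USER_AGENT})

# Build the request without sending it, so the headers can be checked offline.
request = requests.Request("GET", "https://httpbin.org/headers")
prepared = session.prepare_request(request)
print(prepared.headers["User-Agent"])
```

Calling session.get(url) afterwards would send the same header without any headers argument.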

2 Choosing a Random Agent from a List of User Agents

Sometimes it is useful to work with several User-Agent values (for example, when crawling or analyzing a site). The functions below pick a random User-Agent from a list and send a request with it.

import random

import requests

# List of User-Agent strings
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Safari/605.1.15",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 14_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Mobile/15E148 Safari/604.1"
]

def get_random_user_agent():
    """
    Return a random User-Agent from the list.
    """
    return random.choice(user_agents)

def get_page_random_user_agent(url):
    """
    Send a request to the URL with a random User-Agent.
    """
    user_agent = get_random_user_agent()
    headers = {"User-Agent": user_agent}

    response = requests.get(url, headers=headers)
    return response

# Test with a URL and a random User-Agent
url = "https://httpbin.org/headers"
response = get_page_random_user_agent(url)

# Print the response
print(response.json())

Analysis:

  • random.choice(user_agents): picks a random User-Agent from the user_agents list.

  • headers = {"User-Agent": user_agent}: stores the chosen value under the User-Agent header, which is then used in the request.
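Because random.choice draws independently each time, the same User-Agent can come up several times in a row. If that is undesirable, one possible variation is to exclude the previously used value before choosing. The helper name pick_different_user_agent and its previous parameter are hypothetical, introduced only for this sketch.

```python
import random

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Safari/605.1.15",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 14_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Mobile/15E148 Safari/604.1",
]


def pick_different_user_agent(previous=None):
    """Pick a random User-Agent that differs from the previous one when possible."""
    # Fall back to the full list if filtering would leave nothing to choose from.
    candidates = [ua for ua in USER_AGENTS if ua != previous] or USER_AGENTS
    return random.choice(candidates)


first = pick_different_user_agent()
second = pick_different_user_agent(previous=first)
print(first != second)  # True: the list holds three distinct values
```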

3 Browser Analysis via the User-Agent

By parsing the User-Agent, you can extract information about the browser and device. The following example defines a function that derives the browser and operating system from a User-Agent string.

def analyze_user_agent(user_agent):
    """
    Extract the browser and operating system from a User-Agent string.
    """
    browser = "Unknown"
    os_name = "Unknown"

    # Detect the browser. Chrome is checked first because its
    # User-Agent also contains "Safari".
    if "Chrome" in user_agent:
        browser = "Chrome"
    elif "Safari" in user_agent:
        browser = "Safari"
    elif "Firefox" in user_agent:
        browser = "Firefox"

    # Detect the operating system. "iPhone OS" is checked before "Mac OS X"
    # because iPhone User-Agents contain "like Mac OS X".
    if "Windows" in user_agent:
        os_name = "Windows"
    elif "iPhone OS" in user_agent:
        os_name = "iOS"
    elif "Mac OS X" in user_agent:
        os_name = "Mac OS X"
    elif "Android" in user_agent:
        os_name = "Android"

    return browser, os_name

# Analyze a User-Agent
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
browser, os_name = analyze_user_agent(user_agent)
print("Browser:", browser)
print("Operating system:", os_name)

Analysis:

  • if "Chrome" in user_agent: checks whether the User-Agent contains the text "Chrome" and identifies the browser. Chrome is tested before Safari because a Chrome User-Agent also contains "Safari".

  • if "Windows" in user_agent: checks whether the User-Agent contains "Windows" and identifies the operating system.
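Substring checks identify the browser family but not its version. A regular expression can capture the version number as well. The following is a small sketch under the assumption that only Chrome, Firefox, and Safari need to be recognised; extract_browser_version and its pattern list are written for this example.

```python
import re


def extract_browser_version(user_agent):
    """Return (browser, version) for the first matching pattern, else (None, None)."""
    # Order matters: a Chrome User-Agent also contains "Safari/...",
    # so the Chrome pattern is tried first.
    patterns = [
        ("Chrome", r"Chrome/([\d.]+)"),
        ("Firefox", r"Firefox/([\d.]+)"),
        # Safari reports its own version in the "Version/..." token.
        ("Safari", r"Version/([\d.]+).*Safari/"),
    ]
    for name, pattern in patterns:
        match = re.search(pattern, user_agent)
        if match:
            return name, match.group(1)
    return None, None


chrome_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
)
safari_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/14.0.3 Safari/605.1.15"
)
print(extract_browser_version(chrome_ua))  # ('Chrome', '92.0.4515.107')
print(extract_browser_version(safari_ua))  # ('Safari', '14.0.3')
```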

4 Checking the User Agent in the Website's Response

A website receives the User-Agent along with every request. The https://httpbin.org/headers endpoint echoes the request headers back as JSON, so inspecting its response shows which User-Agent we actually sent.

import requests

def check_user_agent_on_site(url, user_agent):
    """
    Send a request to the URL and show the User-Agent the server received.
    """
    headers = {"User-Agent": user_agent}
    response = requests.get(url, headers=headers)
    response_json = response.json()

    # Print the User-Agent value found in the JSON response
    print("User-Agent sent:", user_agent)
    print("User-Agent returned by the server:", response_json["headers"]["User-Agent"])

# URL and User-Agent for testing
url = "https://httpbin.org/headers"
user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
check_user_agent_on_site(url, user_agent)

Analysis:

  • response_json["headers"]["User-Agent"]: reads the User-Agent value from the JSON response. It is printed next to the value that was sent so the two can be compared by eye.
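The comparison can also be done in code instead of by eye. The sketch below assumes the response has the same shape that the function above reads, a "headers" object containing "User-Agent". The helper name user_agent_was_echoed and the hand-written sample dictionary are illustrative stand-ins for a real response.json() result.

```python
def user_agent_was_echoed(response_json, sent_user_agent):
    """Return True if the echoed User-Agent equals the one that was sent."""
    # .get() avoids a KeyError when the response has an unexpected shape.
    echoed = response_json.get("headers", {}).get("User-Agent")
    return echoed == sent_user_agent


# Hand-written stand-in for response.json(); not real server output.
sample = {"headers": {"Host": "httpbin.org", "User-Agent": "my-agent/1.0"}}
print(user_agent_was_echoed(sample, "my-agent/1.0"))     # True
print(user_agent_was_echoed(sample, "other-agent/2.0"))  # False
```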

5 Rotating Through User Agents

The following function picks the next User-Agent from the list on each request, in round-robin order rather than at random, and sends the request to the site.

import itertools

import requests

# Wrap the user_agents list (defined in section 2) in itertools.cycle
# so it can be iterated over endlessly
user_agents_cycle = itertools.cycle(user_agents)

def get_page_cyclic_user_agent(url):
    """
    Send a request with the next User-Agent in the cycle.
    """
    user_agent = next(user_agents_cycle)
    headers = {"User-Agent": user_agent}

    response = requests.get(url, headers=headers)
    return response

# Send requests to the URL with rotating User-Agents
url = "https://httpbin.org/headers"
for _ in range(5):  # Send 5 requests
    response = get_page_cyclic_user_agent(url)
    print(response.json())

Analysis:

  • user_agents_cycle = itertools.cycle(user_agents): creates an endless iterator over the user_agents list with itertools.cycle.

  • user_agent = next(user_agents_cycle): takes the next User-Agent each time a new request is sent, returning to the start of the list after the last one.
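The wrap-around behaviour is easy to see without sending any requests. In this small illustration the three short strings are placeholders standing in for real User-Agent values.

```python
import itertools

agents = ["agent-A", "agent-B", "agent-C"]  # placeholder values
agents_cycle = itertools.cycle(agents)

# Five picks from a three-item list: the fourth pick starts over.
picked = [next(agents_cycle) for _ in range(5)]
print(picked)  # ['agent-A', 'agent-B', 'agent-C', 'agent-A', 'agent-B']
```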

6 Complete Program

The program below combines all of the functions above: it sends requests to a website with different User-Agent values, identifies the browser and operating system, and rotates through the User-Agent list.

import itertools
import random

import requests

user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Safari/605.1.15",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 14_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Mobile/15E148 Safari/604.1"
]
user_agents_cycle = itertools.cycle(user_agents)

def get_random_user_agent():
    return random.choice(user_agents)

def analyze_user_agent(user_agent):
    browser = "Unknown"
    os_name = "Unknown"

    # Chrome is checked first because its User-Agent also contains "Safari"
    if "Chrome" in user_agent:
        browser = "Chrome"
    elif "Safari" in user_agent:
        browser = "Safari"
    elif "Firefox" in user_agent:
        browser = "Firefox"

    # "iPhone OS" is checked before "Mac OS X" because iPhone
    # User-Agents contain "like Mac OS X"
    if "Windows" in user_agent:
        os_name = "Windows"
    elif "iPhone OS" in user_agent:
        os_name = "iOS"
    elif "Mac OS X" in user_agent:
        os_name = "Mac OS X"
    elif "Android" in user_agent:
        os_name = "Android"

    return browser, os_name

def get_page_random_user_agent(url):
    user_agent = get_random_user_agent()
    headers = {"User-Agent": user_agent}
    response = requests.get(url, headers=headers)
    return response

def get_page_cyclic_user_agent(url):
    # Take the next User-Agent in round-robin order
    user_agent = next(user_agents_cycle)
    headers = {"User-Agent": user_agent}
    response = requests.get(url, headers=headers)
    return response

def check_user_agent_on_site(url, user_agent):
    headers = {"User-Agent": user_agent}
    response = requests.get(url, headers=headers)
    response_json = response.json()
    print("User-Agent sent:", user_agent)
    print("User-Agent returned by the server:", response_json["headers"]["User-Agent"])

# Call the functions with a test URL
url = "https://httpbin.org/headers"
random_response = get_page_random_user_agent(url)
print("Request with random User-Agent:", random_response.json())

for _ in range(5):
    cyclic_response = get_page_cyclic_user_agent(url)
    print("Request with cycled User-Agent:", cyclic_response.json())

user_agent = user_agents[0]
browser, os_name = analyze_user_agent(user_agent)
print("Browser:", browser)
print("Operating system:", os_name)

check_user_agent_on_site(url, user_agent)

With this program you can practice setting and changing User-Agent values, analyzing them, and rotating through them.
