Crawling BitTorrent DHTs for Fun and Profit [pdf]

(usenix.org)

39 points | by dgellow 3 days ago

5 comments

  • hdgr 2 hours ago
    Bitmagnet (https://bitmagnet.io/) does exactly that. I left it running for a few weeks and then stopped the crawler. Didn't expect much, but still somewhat disappointed by the garbage it reeled in.
    • NegativeLatency 2 hours ago
      I've had one running for over a year now, and it has completely replaced my usage of regular torrent sites. There is a lot of junk, and it gets stale, but it's still a better experience than most of the public trackers out there, IMO.
      • qingcharles 1 hour ago
        Are you running it at home?

        I built one with a nice TUI to run on a VPS so I can try and find rare magazine torrents, but Hetzner were upset about it. I need to find it a new home. It was a very good citizen, but it still raised too many flags.

        • k4rli 1 hour ago
          Runs fine at home. I've indexed 20M+ torrents in the last few months, running it during the day. With Prowlarr (or similar) it could easily replace other indexers.
        • NoMoreNicksLeft 1 hour ago
          Which magazines?
          • qingcharles 43 minutes ago
            Anything I don't have! Sometimes I'll find a torrent and no seeds/peers and I'll wonder if there is another torrent out there that has the same files in it somewhere that I can find.

            The other day I was trying to track down some older High Times issues that were torrented, but the torrent is dead. Last night it was a mag titled Films & Filming, which I know has been scanned but can't find anywhere.

            • toomuchtodo 22 minutes ago
              Can I get a copy for the Internet Archive? Will take as much of the corpus as you’re willing to provide.

              (no affiliation with them)

      • drdexebtjl 1 hour ago
        I disabled mine because it was constantly writing to my SSD.
        • felooboolooomba 1 hour ago
          I solved it by storing the data on /dev/null
          • Daviey 39 minutes ago
            The writes are insanely fast.
          • permalac 52 minutes ago
            Pretty big space.
    • qingcharles 42 minutes ago
      This is a good tip, thanks. I'll probably replace my home-grown scanner with this one.
  • gritzko 1 hour ago
    2010. I remember those times. I was doing these things for science in 2008. Performance-wise, PEX was much faster than DHT. At least, in my setting.

    This year, I gave it as an assignment to students. It does not take much time with LLMs.

  • MoonWalk 1 hour ago
    The article neglects to define "DHT" before using it.
    • ivanjermakov 48 minutes ago
      Distributed hash table: a BitTorrent extension for discovering a torrent's seeders by advertising its hash across a pool of known peers. Think of it as a distributed tracker, in contrast to the traditional way of asking a known tracker for that torrent's peers.

      Its algorithm is very elegant, using binary search on peers' and torrents' hashes, narrowing down to peers that are more likely to be seeders (or at least know some).

      https://www.bittorrent.org/beps/bep_0005.html
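
      A minimal sketch of the narrowing step, assuming the XOR distance metric that BEP 5 takes from Kademlia. The network is simulated as a plain dict, and the names xor_distance, closest_nodes and iterative_lookup are hypothetical helpers for illustration, not part of any real client:

```python
from typing import Dict, Iterable, List


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: XOR of two IDs, read as an unsigned integer."""
    return a ^ b


def closest_nodes(target: int, known: Iterable[int], k: int = 8) -> List[int]:
    """Return the k known IDs nearest to target under the XOR metric."""
    return sorted(known, key=lambda n: xor_distance(n, target))[:k]


def iterative_lookup(
    target: int,
    start: Iterable[int],
    neighbours: Dict[int, List[int]],
    k: int = 8,
) -> List[int]:
    """Repeatedly ask the closest known nodes for nodes even closer to target.

    `neighbours` stands in for the network (an assumption of this sketch):
    it maps a node ID to the IDs that node would return when queried.
    """
    shortlist = closest_nodes(target, start, k)
    queried = set()
    while True:
        pending = [n for n in shortlist if n not in queried]
        if not pending:
            # Every node in the shortlist has answered; nothing closer was found.
            return shortlist
        for n in pending:
            queried.add(n)
            # Simulated reply: merge what this node knows, keep the k closest.
            candidates = set(shortlist) | set(neighbours.get(n, []))
            shortlist = closest_nodes(target, candidates, k)


# Toy network: each node knows one node that is closer to the target 0b1111.
toy_network = {0b0000: [0b1000], 0b1000: [0b1100], 0b1100: [0b1110], 0b1110: [0b1111]}
result = iterative_lookup(0b1111, [0b0000], toy_network, k=2)
```

      Real IDs are 160-bit values and a real client sends queries over UDP in parallel; the toy 4-bit IDs only show how each round shrinks the distance to the target.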

      • loeg 23 minutes ago
        Not a P2P innovation of BitTorrent's, FWIW. The Kademlia DHT (used in the eMule/LimeWire/Gnutella P2P networks) predates BitTorrent's adoption of it.
  • hackingonempty 2 hours ago
    (2010)
  • Boss0565 1 hour ago
    old paper