Tits In Tops Thread Scraper Tool / How to download TiTs pictures?

DownThemAll! works. Use the 'Links' tab, as the 'Media' tab will just download the thumbnails.
Thanks so much again for this, man. Any chance you've gotten it to work well for Coomer? When I try it, all it does is grab the lower-quality preview images, and it doesn't get any video files or images from the multi-photo posts. With a site like that, do you have to open each and every post page before running DTA?
 
Thanks so much again for this, man. Any chance you've gotten it to work well for Coomer?

I'd never even heard of that website before I read your post. Another member just described it as 'beyond illegal' so now I'm interested!
 
I'd never even heard of that website before I read your post. Another member just described it as 'beyond illegal' so now I'm interested!
Where'd you see this member describing it as such? Curious as I'd like to pick their brain.

My college thesis centered on the DMCA, SOPA, and other laws around internet "piracy". Two takeaways I still carry with me from it are:

1. Accessing content - whether pirated, assumed pirated, or not - is never prosecutable unless the content itself is illegal (e.g. underage nudity), and ...

2. These laws almost never hold up to constitutional/jurisprudential scrutiny, and the manner in which they're invoked/applied without due process by large content hosts (YouTube, etc.) and ISPs is generally beyond illegal itself for a number of reasons.
 
First of all, thank you mason2371 for sharing this great little tool :)

The issue I need some help with is this: main.py: error: argument -o/--output-directory: Not a directory:

I'm wondering what the best approach is to have the script automatically create the OUTPUT_DIRECTORY when it finds that the folder does not already exist. I'd like to keep my scraped threads separated nicely, but I'd also like to avoid browsing to my download location and creating folders manually each time.

As I understand it the folder check happens when the argument is parsed here:

Python:
parser.add_argument(
        "-o",
        "--output-directory",
        type=file_validator(directory=True),
        default=".",
        help="Directory to download media files to. Default is '.'",
    )

I'm trying to understand if I can just comment out the validation check and then add a folder creation command like this: https://www.geeksforgeeks.org/how-to-create-directory-if-it-does-not-exist-using-python/
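For reference, the folder-creation pattern that article describes boils down to something like this (a minimal sketch; the path below is just a placeholder):

Python:
import os

# Placeholder path; in the script this would be whatever -o/--output-directory was given.
OUTPUT_DIRECTORY = "downloads/example-thread"

# Create the folder (and any missing parent folders) only if it doesn't exist yet.
os.makedirs(OUTPUT_DIRECTORY, exist_ok=True)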
 
Personally, as a Unix nerd, my solution would be to create that directory before running the scraper, or to write a wrapper script that makes sure the output directory exists first. However, I understand that people have different requirements, and some of us are even forced to do terrible things, like use MS Windows. So for that: yes, you can remove the
Code:
type=file_validator(directory=True),
line and it should work fine. However, that line is there to ensure you didn't mess up your command and type the wrong directory, or, worse yet, the name of an existing file. You could also add the directory-creation code from that article to the script if you'd like. I would recommend adding it inside the file_validator function so you can still catch errors such as trying to save into an existing file. That would look something like this:
Python:
import argparse
import pathlib
from typing import Callable


def file_validator(directory: bool) -> Callable[[str], pathlib.PurePath]:
    """
    Convert a path string to a Path object. If directory is True, create the
    directory (and any missing parents) when it does not already exist, and raise
    ArgumentTypeError if the path exists but is not a directory. Otherwise, raise
    ArgumentTypeError unless the path is an existing file. Meant to be used as the
    type argument for an argparse arg.
    """

    def _validator(pathstr: str):
        path = pathlib.Path(pathstr)
        if directory:
            try:
                # Create the directory if needed; an existing directory is fine
                # because of exist_ok=True.
                path.mkdir(parents=True, exist_ok=True)
            except FileExistsError:
                # The path exists but is a regular file, not a directory.
                raise argparse.ArgumentTypeError(f"Not a directory: {pathstr}")
        else:
            if not path.is_file():
                raise argparse.ArgumentTypeError(f"File not found: {pathstr}")
        return path.resolve()

    return _validator
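If you want to sanity-check the change in isolation, here is a quick usage sketch (the path is just an example) that reuses the file_validator defined above:

Python:
# Assumes the imports and file_validator definition shown above.
parser = argparse.ArgumentParser()
parser.add_argument(
    "-o",
    "--output-directory",
    type=file_validator(directory=True),
    default=".",
    help="Directory to download media files to. Default is '.'",
)

# "downloads/example-thread" is just a placeholder; any not-yet-existing path works.
args = parser.parse_args(["-o", "downloads/example-thread"])
print(args.output_directory)  # absolute path; the directory now exists on disk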
 

Thank you very much! This modification works like a charm. I hadn't considered combining the path.mkdir call with the file validator, so this is even better.

While my PC is running Windows, I'm actually hosting the script on my NAS, which runs Linux. Since my last post, I went down a bit of a rabbit hole. I was able to import the script into script-server on my NAS, so now I can use it from my desktop browser without needing to open a terminal each time. :geek: I included the script-server config file below in case anyone would like to use it.

Learning a lot this week all in the name of boobs... hah...

JSON:
{
  "name": "Tits In Tops Thread Scraper Tool",
  "script_path": "python3 main.py",
  "working_directory": "/app/scripts/titsintops",
  "description": "By mason2371",
  "group": "scrapers",
  "output_format": "terminal",
  "parameters": [
    {
      "name": "Dry run",
      "param": "--dry-run",
      "no_value": true,
      "description": "Do not write downloaded media to disk"
    },
    {
      "name": "Username",
      "required": true,
      "param": "--username",
      "type": "text",
      "default": "CHANGE ME",
      "constant": true
    },
    {
      "name": "Password",
      "required": true,
      "param": "--password",
      "type": "text",
      "default": "CHANGE ME",
      "constant": true
    },
    {
      "name": "Verbose",
      "param": "--verbose",
      "no_value": true,
      "description": "Print more detailed diagnostic messages"
    },
    {
      "name": "Quiet",
      "param": "--quiet",
      "type": "text",
      "no_value": true,
      "description": "Do not print standard output messages. Does not affect --verbose diagnostic messages"
    },
    {
      "name": "Overwrite",
      "param": "--clobber",
      "no_value": true,
      "description": "Overwrite files that already exist in output directory. Default is to save files with a new name instead."
    },
    {
      "name": "Link List",
      "param": "--links",
      "no_value": false,
      "default": "links.txt",
      "constant": true,
      "description": "Write links posted in thread to a file"
    },
    {
      "name": "Archive Enabled",
      "param": "--archive",
      "type": "text",
      "default": "archive.txt",
      "constant": true,
      "description": "Record downloaded media to a file, and skip media files already listed in the archive"
    },
    {
      "name": "Paginate",
      "required": false,
      "param": "--paginate",
      "no_value": true,
      "description": " Store files in directories for each thread page. Useful for extremely large threads as thousands of files in the same directory tends to cause performance issues."
    },
    {
      "name": "Output Folder",
      "required": true,
      "param": "--output-directory",
      "type": "text",
      "default": "unsorted",
      "description": "Directory to download media files to. Default is '.'"
    },
    {
      "name": "Thread",
      "required": true,
      "type": "text",
      "description": "URL of the thread to download"
    }
  ]
}
 
Nice. Tits are of course the best motivator. And thanks for that GitHub link; I'd never heard of that project, but I might have to set it up on my own server now.
 
I'm so happy I visited this subforum. I didn't know we had a tool like this here. Going to have to try it, since there are quite a few threads I'd love to scrape quickly. At one point I was looking for forum scrapers that might work on any XenForo forum.
 
Getting an SSL certificate verification failed error.

File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests/adapters.py", line 517, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='titsintops.com', port=443): Max retries exceeded with url: /phpBB2/index.php?login/ (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1006)')))
 
Something got messed up with the site. For a while today it said it was not available when I tried to visit, and I had to clear my cookies and site data and then explicitly tell my browser to proceed after it blocked me. The site's certificate might just need renewing.

Unfortunately it does break the script. Hoping whatever changed/broke is resolved soon. This is such a great utility.
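If you need the script to limp along while the certificate is broken, a requests session can be told to skip verification. This is insecure and only a stopgap, and the snippet below is a standalone sketch rather than a patch to the actual script (I'm assuming it builds a requests.Session somewhere):

Python:
import requests
import urllib3

# Silence the warning urllib3 emits for unverified HTTPS requests.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

session = requests.Session()
session.verify = False  # skip TLS certificate validation (insecure, temporary)

# Same URL the traceback shows the scraper requesting.
resp = session.get("https://titsintops.com/phpBB2/index.php?login/")
print(resp.status_code)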

Edit: It's working again. The site certificate issue is resolved.
 
I thought it was just me. I'm still getting multiple errors and have to clear browsing data just to get access again. It took me 30 mins to type a comment earlier.
 
I'm too dumb to know how to get this to work :ROFLMAO:
It would be great if this worked with Cyberdrop :emoji_pray:
 
I thought it was just me. I'm still getting multiple errors and have to clear browsing data just to get access again. It took me 30 mins to type a comment earlier.
Might be some server issues. I haven't had that same error again, but there are times when the site is really slow to load or some links just won't open. Even the alerts dropdown can take a long time sometimes.
 