Grouping URLs by domain and directory in JavaScript

Is there a way to group URLs from a sorted list by domain and directory?

  • When two or more URLs share the same domain and the same first directory (the first path segment after the domain), they should be grouped together in an array;

  • URLs that are alone in their first directory should instead be grouped in an array with the other such URLs from the same domain.

Take the following list of URLs as an example:

var url_list = ["https://www.facebook.com/impression.php/f2e61d9df/?lid=115",
"https://www.facebook.com/plugins/like.php?app_id=5",
"https://www.facebook.com/tr/a/?id=228037074239568",
"https://www.facebook.com/tr/b/?ev=ViewContent",
"http://www.marvel.com/abc?f=33",
"http://www.marvel.com/games?a=11",
"http://www.marvel.com/games?z=22",
"http://www.marvel.com/videos"]

They should be grouped as shown below:

var group_url = [
    ["https://www.facebook.com/impression.php/f2e61d9df/?lid=115","https://www.facebook.com/plugins/like.php?app_id=5"],
    ["https://www.facebook.com/tr/a/?id=228037074239568","https://www.facebook.com/tr/b/?ev=ViewContent"],
    ["http://www.marvel.com/abc?f=33","http://www.marvel.com/videos"],
    ["http://www.marvel.com/games?a=11","http://www.marvel.com/games?z=22"]
]

I attempted to write code to group URLs by domain, but could not find a way to also consider directories:

var group_url = [];
var count = 0;
var url_list = ["https://www.facebook.com/impression.php/f2e61d9df/?lid=115",
  "https://www.facebook.com/plugins/like.php?app_id=5",
  "https://www.facebook.com/tr/?id=228037074239568",
  "https://www.facebook.com/tr/?ev=ViewContent",
  "http://www.marvel.com/abc?f=33",
  "http://www.marvel.com/games?a=11",
  "http://www.marvel.com/games?z=22",
  "http://www.marvel.com/videos"]
      
for(var i = 0; i < url_list.length; i++) {
  if(url_list[i] != "") {
    var current = url_list[i].replace(/.*?:\/\//g, "");
    var check = current.substr(0, current.indexOf('/'));
    group_url.push([])
    for(var j = i; j < url_list.length; j++) {
      var add_url = url_list[j];
      if(add_url.indexOf(check) != -1) {
        group_url[count].push(add_url);
        url_list[j] = "";
      }
      else {
        break;
      }
    }
    count += 1;
  }
}
    
console.log(JSON.stringify(group_url));

Answer №1

If you're looking to organize the URLs by domain+dir and then group those that are alone in their group by domain only, you can achieve that with the following ES5 script:

var url_list = ["https://www.facebook.com/impression.php/f2e61d9df/?lid=115",
"https://www.facebook.com/plugins/like.php?app_id=5",
"https://www.facebook.com/tr/a/?id=228037074239568",
"https://www.facebook.com/tr/b/?ev=ViewContent",
"http://www.marvel.com/abc?f=33",
"http://www.marvel.com/games?a=11",
"http://www.marvel.com/games?z=22",
"http://www.marvel.com/videos"];

// Group the URLs, keyed by domain+dir
var hash = url_list.reduce(function (hash, url) {
    // ignore protocol, and extract domain and first dir:
    var domAndDir = url.replace(/^.*?:\/\//, '').match(/^.*?\..*?\/[^\/?#]*/)[0];
    hash[domAndDir] = (hash[domAndDir] || []).concat(url);
    return hash;
}, {});

// Regroup URLs by domain only, when they are alone for their domain+dir
Object.keys(hash).forEach(function (domAndDir) {
    if (hash[domAndDir].length == 1) {
        var domain = domAndDir.match(/.*\//)[0];
        hash[domain] = (hash[domain] || []).concat(hash[domAndDir]);
        delete hash[domAndDir];
    }
});
// Convert hash to array
var result = Object.keys(hash).map(function(key) {
    return hash[key];
});

// Output result
console.log(result);

Note: I stuck to ES5 as per your comment, but you might want to consider using ES6 Map for this task, as it is better suited for handling such a hash.
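For readers who can use ES6, the note above could be sketched roughly as follows. This is only an illustration of the same two-pass idea with a Map, not a drop-in replacement: the function name groupUrls is made up for this example, and it assumes every entry is an absolute URL with a host.

```javascript
// Hypothetical helper (name chosen for this sketch): groups URLs by
// domain + first directory, then falls back to domain only for URLs
// that are alone in their directory.
function groupUrls(urls) {
  // Pass 1: group by "domain/firstDir" (firstDir may be empty).
  const byDir = new Map();
  for (const url of urls) {
    const match = url.replace(/^.*?:\/\//, "").match(/^([^\/?#]+)\/?([^\/?#]*)/);
    const key = match[1] + "/" + match[2];
    if (!byDir.has(key)) byDir.set(key, []);
    byDir.get(key).push(url);
  }

  // Pass 2: a URL that is alone in its group is re-keyed by "domain/" only.
  const groups = new Map();
  for (const [key, list] of byDir) {
    const finalKey = list.length === 1 ? key.slice(0, key.indexOf("/") + 1) : key;
    if (!groups.has(finalKey)) groups.set(finalKey, []);
    groups.get(finalKey).push(...list);
  }

  // A Map iterates in insertion order, so the groups come out in the
  // order in which their keys were first seen.
  return [...groups.values()];
}
```

Calling groupUrls(url_list) with the list from the question should give the four groups shown there. Because a Map keeps insertion order for any key, the result does not depend on how object property order is handled.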

Answer №2

// Exclude the protocol, drop any query string or fragment, then split on "/"
var updatedURLList = url_list.map(function (item) {
    return item.split("://")[1].split(/[?#]/)[0].split("/");
});

// Build a nested lookup: object[domain][firstDir] = [urls]
var object = {};
for (var i = 0; i < updatedURLList.length; i++) {
    var individualItem = updatedURLList[i];
    var domain = individualItem[0];
    var dir = individualItem[1] || "";
    object[domain] = object[domain] || {};
    object[domain][dir] = object[domain][dir] || [];
    object[domain][dir].push(url_list[i]);
}

Usage example:

object["www.facebook.com"];//{"impression.php":[],plugins:[],tr:[]}
object["www.facebook.com"]["tr"];//[url1,url2]

Demo: http://jsbin.com/qacasexowi/edit?console (type "result" in the console)
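The nested object is not yet in the array shape the question asks for. One possible way to flatten it is sketched below; flattenGroups is a name invented for this example, and it assumes the input has the shape { domain: { firstDir: [urls] } }.

```javascript
// Hypothetical helper: flattens { domain: { dir: [urls] } } into an
// array of groups. URLs that are alone in their directory are merged
// into one group per domain.
function flattenGroups(nested) {
  var result = [];
  Object.keys(nested).forEach(function (domain) {
    var singles = []; // URLs that are alone in their directory
    var shared = [];  // directory groups holding two or more URLs
    Object.keys(nested[domain]).forEach(function (dir) {
      var urls = nested[domain][dir];
      if (urls.length === 1) {
        singles.push(urls[0]);
      } else {
        shared.push(urls);
      }
    });
    if (singles.length) {
      result.push(singles);
    }
    shared.forEach(function (group) {
      result.push(group);
    });
  });
  return result;
}
```

Within each domain, the merged group of single URLs is emitted first, followed by the directory groups.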

Answer №3

If you're looking for a reliable way to work with URLs, I highly recommend checking out the URI.js library. It provides tools for parsing, querying, and manipulating URLs, and more information is available on its official website.

One useful feature of URI.js is its ability to handle paths in a straightforward manner. Here's an example straight from the API documentation:

var uri = new URI("http://example.org/foo/hello.html");
// get pathname
uri.pathname(); // returns "/foo/hello.html"
// set pathname
uri.pathname("/foo/hello.html"); // returns the URI instance for chaining

// encoding
uri.pathname("/hello world/");
uri.pathname() === "/hello%20world/";
// decoding
uri.pathname(true) === "/hello world/";

// handling empty paths
URI("").path() === "";
URI("/").path() === "/";
URI("http://example.org").path() === "/";

Once you get the hang of it, working with URI.js should be a breeze.
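If adding a library is not an option, the part needed for this question can also be done with the standard URL constructor available in modern browsers and Node.js. Note that this is a different technique from URI.js, shown only as a rough sketch; domainAndFirstDir is a hypothetical helper name, and it assumes the input is an absolute URL.

```javascript
// Hypothetical helper: returns "hostname/firstDir" for an absolute URL,
// using the built-in URL class (not URI.js).
function domainAndFirstDir(url) {
  var parsed = new URL(url); // throws a TypeError if the URL is invalid
  // pathname always starts with "/", so index 1 is the first directory.
  var firstDir = parsed.pathname.split("/")[1] || "";
  return parsed.hostname + "/" + firstDir;
}
```

The returned string can serve as the grouping key for the URLs in the question, since the query string and any deeper path segments are ignored.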

Answer №4

A recommended approach is to utilize an object and group the data by domain and the first string after the domain. This can then be iterated through to transform the data into the desired structure.

This method can handle unsorted data as well.

var url_list = ["https://www.facebook.com/impression.php/f2e61d9df/?lid=115", "https://www.facebook.com/plugins/like.php?app_id=5", "https://www.facebook.com/tr/a/?id=228037074239568", "https://www.facebook.com/tr/b/?ev=ViewContent", "http://www.marvel.com/abc?f=33", "http://www.marvel.com/games?a=11", "http://www.marvel.com/games?z=22", "http://www.marvel.com/videos"],
    temp = [],
    result;

url_list.forEach(function (a) {
    var m = a.match(/.*?:\/\/([^\/]+)\/?([^\/?]+)?/);
    m.shift();
    m.reduce(function (r, b) {
        if (!r[b]) {
            r[b] = { _: [] };
            r._.push({ name: b, children: r[b]._ });
        }
        return r[b];
    }, this)._.push(a);
}, { _: temp });

result = temp.reduce(function (r, a) {
    var top = [],
        parts = [];

    a.children.forEach(function (b) {
        if (b.children.length === 1) {
            top.push(b.children[0]);
        } else {
            parts.push(b.children);
        }
    });
    return top.length ? r.concat([top], parts) : r.concat(parts);
}, []);

console.log(result);

Answer №5

This code snippet is designed specifically to meet your requirements:

var url_list = ["https://www.example.com/page1",
"https://www.example.com/page2",
"https://www.example.com/page3",
"https://www.example.com/page4",
"http://www.test.com/sample1",
"http://www.test.com/sample2",
"http://www.test.com/sample3",
"http://www.test.com/sample4"];

var folderGroups = {};
for (var i = 0; i < url_list.length; i++) {

  var myRegexp = /.*\/\/[^\/]+\/[^\/\?]+/g;
  var match = myRegexp.exec(url_list[i]);
  var keyForUrl = match[0];
  if (folderGroups[keyForUrl] == null) {
    folderGroups[keyForUrl] = [];
  }
  folderGroups[keyForUrl].push(url_list[i]);
}

var toRemove = [];
Object.keys(folderGroups).forEach(function(key,index) {
    if (folderGroups[key].length == 1) {
      toRemove.push(key);
    }
});
for (var i = 0; i < toRemove.length; i++) {
  delete folderGroups[toRemove[i]];
}

//console.log(folderGroups);

var domainGroups = {};
for (var i = 0; i < url_list.length; i++) {
  // Skip URLs that were already collected in a folder group
  var myRegexpPrev = /.*\/\/[^\/]+\/[^\/\?]+/g;
  var matchPrev = myRegexpPrev.exec(url_list[i]);
  var checkIfPrevSelected = matchPrev[0];
  if (folderGroups[checkIfPrevSelected] != null) {
    continue;
  }
  //Get for domain group
  var myRegexp = /.*\/\/[^\/]+/g;
  var match = myRegexp.exec(url_list[i]);
  var keyForUrl = match[0];
  if (domainGroups[keyForUrl] == null) {
    domainGroups[keyForUrl] = [];
  }
  domainGroups[keyForUrl].push(url_list[i]);
}

//console.log(domainGroups);

// Merge the folder groups and the domain groups into one object
var finalResult = Object.assign({}, folderGroups, domainGroups);
console.log(Object.values(finalResult));
